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METHOD AND APPARATUS FOR 
MANAGING STREAMING DATA 

CROSS REFERENCE TO RELATED APPLICATIONS 

This application claims priority from provisional patent application 60/157,998 

entitled STREAMING MACHINE-AWARE BINARY AND MULTIMEDIA CONTENT 
FOR MULTIMEDIA SYNCHRONIZATION filed 10/06/99. 

The above referenced application(s) are incorporated herein by reference for all 
purposes. These prior applications, in some parts, may indicate earlier efforts at 
describing the invention or describing specific embodiments and examples. The present 
invention is therefore best understood as described herein. 

FIELD OF THE INVENTION 

The invention generally relates to methods and/or devices related to data 

transmission and presentation using a computing device or information appliance. More 
particularly, the invention in various aspects relates to multimedia data methods and 
applications and to a versatile file structure suitable for providing multimedia data and 
streaming data over a communication data network. 

BACKGROUND OF THE INVENTION 

Previous works related to aspects of the present discussion include below listed 

reference by Blakowski et al. and by the WWW Consortium. The Blakowski reference 
provides a survey of research regarding media synchronization. The reference discusses a 
four-layer synchronization reference model. From the lower layer to the upper layer are 
Media Layer, Stream Layer, Object Layer, and Specification Layer. A variety of 
techniques have been published and implemented in the specification area. SMIL is one 
of the latest efforts in the industry to provide a presentation specification using 
descriptive tagged language, which follows the syntax of XML. 

Existing and popular media container formats for computers include the 
QuickTime from Apple Computer (http://www.apple.corn), ASF/AVI from Microsoft 
(http://www.microsoft.com), RealMedia from RealNetworks (http://www.real.corn), and 
Video Capsule from Digital Lava (http://www.digitalIavacorn). Some of these such as 
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RealMedia and Video Capsule do not have published technical specifications. Some 
existing formats, such as .exe video files, provide a means for distributing media clips that 
are executable and viewable in a standard operating system without the need for installing 
additional software. In such a file, necessary software components are included in a 
5 single file with a name like movieclip.exe. Generally, such a file is not capable of playing 
streaming data or of varying the components downloaded. 

Downloadable machine code technology (e.g. applets) is also known. Sun 
pioneered aspects of downloadable executable components by creating Java, which has 
become used to make software components downloadable through web browsers to run 
10 on a local virtual machine. Microsoft's COM technology is a later developed technology 
ready to be downloaded from Internet and run in a document container, such as a web 
browser. 

The Internet is a rapidly growing communication network of interconnected 
computers around the world. Together, these millions of connected computers form a vast 

15 repository of hypermedia information that is readily accessible by users through any of 
the connected computers from anywhere and anytime. As there are an increasing number 
of users who are connected to the Internet and surf for various information, there is 
tremendous demand for more content to be available and methods to deliver content on 
the Internet. Currently, the most commonly available media information that is 

20 deliverable on the Internet may include text information^ images and graphics, videos and 
audio clips. 

One known format for this data is streaming data, often used for video and/or 
audio data. For example, streaming video is a sequence of encoded images sent over a 
network or other communications media and presented at a user device, with presentation 

25 of part of the data overlapping some of the receiving of the data. With streaming data, a 
user does not have to wait to completely download a large file before accessing that file 
or parts thereof (e.g. seeing video or hearing sound). As will be understood in the art, 
streaming data, such as video data, can be "live" data, such as streaming encoding of a 
radio or television broadcast, or can be previously stored data, such as pre-recorded audio 

30 or video data. The designation "real-time" is sometimes used in the art to indicate 
streaming data generally, included pre-stored media components. This term is used to 
indicate that delivery of the data is time-critical in that once playback begins, the 
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streaming data must be delivered quickly enough to keep up with "real-time" presentation 
of the data to the user. Such data is sometimes also referred to in the art as "clocked 
data" to indicate that the data generally must be delivered at a nearly constant rate. Too- 
fast delivery can swamp the receiving end's buffer capacity. Too slow delivery will 

5 cause unacceptable halts or pauses in playback. 

To access streaming data, a user generally uses an access module. This module 
can be either software or hardware, with logic functions able to present the data in a form 
desired by a user. As examples, an access module could be a software player (such as 
RealMedia™) running on a compatible information platform. An access device may also 

10 be more dedicated hardware, such as a cell-phone capable of accessing certain types of 
streaming data, a PDA, a television with the necessary logic, etc. One difficulty with 
standard access methods is that with the increased improvement in video/audio 
compression technologies, access module logic may have to be updated frequently when 
new data formats are used. Further, for media presentations including different types of 

15 data, an access device (such as a general purpose computer or PDA) has sometimes to be 
loaded with several different access modules to access different media. 

SMIL and SMIL Players and Authoring Tools 

To facilitate the creation and distribution of multimedia content on the web, in 

June of 1998, the World Wide Web Consortium (W3C) announced the first specification 
20 of the Synchronized Multimedia Integrated Language (SMIL). This language provides 
some mechanisms for structuring various elements of multimedia content, including 
audio, text, image, and motion video and provides a way for web authors to schedule the 
playing of time-based media within HTML documents. 

A SMIL document according to the specification is actually an ASCII text file, 
25 which authors can edit with almost any text editor. SMIL documents generally resemble 
HTML files and references included components from external URL locations. SMIL 
uses XML to provide the mechanism to define SMIL markup tags for associating timing 
and positioning relationships between multimedia objects, so that audio, videos, images, 
and text can be synchronized. 
30 SMIL enables web authors to position (location, size, z-index ordering) visual 

media objects and assign temporal attributes (begin time, duration or end time) among 
one and another. SMIL provides capabilities to create interactive (via hyperlinks) 
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multimedia presentations similar to those on computer-based interactive CDs. SMIL 
media objects (addressable by a URL) can either reside locally or be distributed over the 
web. SMIL media objects can be updated independently and remotely. 

A SMIL Player is used to schedule multimedia presentation in a SMIL document, 
5 retrieving media objects on the web using URLs (Universal Resource Locator) described 
in the document, parsing the timing relationship among downloaded media, and 
consequently displaying the presentation based on the correct timeline. The Java SMIL 
Player is a Java applet-based player that follows a platform-neutral programming 
paradigm and makes use of the XML Parser and Java Media Framework (JMF). It can run 
10 within any popular browser without the need to download any plug-ins or ActiveX 
controls. 

A number of SMIL authoring tools have been proposed, which can generally be 
grouped into two categories: general purpose tools designed to create synchronized 
multimedia content for any SMIL-compliant players; and special purpose tools designed 

15 to create synchronized multimedia content for particular players. While these authoring 
tools are powerful, they still require proficiency in SMIL programming, which limits the 
range of users. Currently most SMIL authoring tools in the market have several properties 
in common: they include a text-based editor; they include a WYSIWYG (What You See 
Is What You Get) layout editor; and they do not include a media timeline design wizard. 

20 T.A.G. SMIL Editor 1 .0, for example, is a general purpose SMIL authoring tool, which is 
designed to create synchronized multimedia content for any SMIL-compliant players. 
SuperTool is a specific purpose SMIL authoring tool for Real G2 Player only. SuperTool 
has a point and click WYSIWYG layout interface. Developers can add available media 
types, arrange media's layout and sequence how they are played in SMIL composition. 

25 Prior Publications 

The following publications may be related to the invention or provide background 

information. Listing of these references here should not be taken to indicate that any 
formal search has been completed or that any of these references constitute prior art. 

Blakowski, G, and Steinmetz, R. A Media Synchronization Survey: Reference 
30 Model, Specification, and Case Studies, IEEE Journal on Selected Areas in 
Communications, Vol. 14, No. 1, pp. 5-35, Jan. 1996. 
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Synchronized Multimedia Integration Language (SMIL) 1.0 Specification, WWW 
Consortium, Jun. 1998, available at www.w3.org/TR/REC-smil. 

SUMMARY 

The approach of the present invention is different from aforementioned works in 

5 part because a Media Missile is open to and can carry different media specification's 
requirements. In one embodiment, not only actions and data embedded in the media 
objects are part of the Media Missile, the Media Missile can also carry executable binary, 
signature, and auxiliary information (which, in specific embodiments, may be pointed to 
by URLs). Unlike the above mentioned prior streaming techniques, which do not 

10 describe carrying various active media content, an executable binary discussed below can 
be carried in a Media Missile to make it ready for a runtime loader on a presentation 
device to run the media missile executable without need for other preinstalled software. 
Thus, in specific embodiments, Media Missile makes uses of operating system's launch 
sequence to run an included executable component and is self-contained for both of the 

15 software components and media data. 

Media Missile, according to specific embodiments of the present invention, can 
benefit from technological advances such as Java or COM, but does not rely on them. 

In specific embodiments, the invention involves a method and/or system for 
providing streaming content that can be activated from a single resource locator and in 

20 response to an activation, begin transferring a streaming presentation in a multimedia 
container, wherein said container has the capability of including one or more executable 
code components, one or more streaming content components, and one or more sets of 
presentation data. A Media Missile launcher in further embodiments may receive from a 
user device data indicating available hardware and software components at said user 

25 device and this data may be used by a scheduler/parser at the user device to request or not 
request different components from the media missile repository. Therefore, a media 
missile can be transmitted with some executable code or no executable code, depending 
on which components are present at the user site. 

In a further embodiment, presenting a media missile according to the invention is 

30 self-directed and requires minimal active management by or presence of preinstalled logic 
components at a user device. 
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In further specific embodiments, the invention involves a method and/or system 
•allowing a creator (or author) to specify a streaming presentation including receiving 
indications from an author specifying one or more items of streaming content and 
specifying presentation parameters. The content and parameters may be located at a 

5 server remote from the author. 

The invention and various specific aspects and embodiments will be better 
understood with reference to the following drawings and detailed descriptions. In 
different figures, similarly numbered items are intended to represent similar functions 
within the scope of the teachings provided herein. In some of the drawings and detailed 

10 descriptions below, the present invention is described in terms of the important 
independent embodiment of delivering visual and/or audio content. This should not be 
taken to limit the invention, which, using the teachings provided herein, can be applied to 
other streaming data content. For purposes of clarity, this discussion refers to devices, 
methods, and concepts in terms of specific examples. However, the invention and aspects 

15 thereof may have applications to a variety of types of devices and systems. It is therefore 
intended that the invention not be limited except as provided by the attached claims and 
equivalents. 

Furthermore, it is well known in the art that logic or digital systems and/or 
methods can include a variety of different components and different functions in a 

20 modular fashion. The following will be apparent to those of skill in the art from the 
teachings provided herein. Different embodiments of the present invention can include 
different combinations of elements and/or functions. Different embodiments of the 
present invention can include actions or steps performed in a different order than 
described in any specific example herein. Different embodiments of the present invention 

25 can include groupings of parts or components into larger parts or components different 
than described in any specific example herein. For purposes of clarity, the invention is 
described in terms of systems that include many different innovative components and 
innovative combinations of innovative components and known components. No 
inference should be taken to limit the invention to combinations containing all of the 

30 innovative components listed in any illustrative embodiment in this specification. The 
functional aspects of the invention, as will be understood from the teachings herein, may 
be implemented or accomplished using any appropriate implementation environment or 
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programming language, such as C++, Cobol, Pascal, Java, Java-script, etc. All 
publications, patents, and patent applications cited herein are hereby incorporated by 
reference in their entirety for all purposes. 

The invention therefore in specific aspects provides a streaming video/audio 
5 signal that can be played on various types of video-capable terminal devices operating 
under any types of operating systems regardless what type of players are preinstalled in 
the terminal devices 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram showing an overall architecture of a system showing 

10 both authoring and playback according to various embodiments of the invention including 
multiple novel elements. 

FIG. 2 illustrates an exemplary configuration of a network in which the present 
invention may be practiced. 

FIG. 3 illustrates an example media missile container format according to specific 
15 embodiments of the invention. 

FIG. 4 is a flow chart illustrating functional aspects of a multimedia player 
according to a specific embodiment of the invention. 

FIG. 5 is a flow chart illustrating an example method for accessing a media 
missile according to a specific embodiment of the invention. 
20 FIG. 6 illustrates the operation of different timers initiated by a scheduler 

according to specific embodiments of the invention. 

FIG. 7 shows a standard browser display of example source files that may be 
packaged into a media missile in an example in accordance with specific embodiments of 
the present invention. 

25 FIG. 8 shows an example interface for packaging source data into a media missile 

according to specific embodiments of the present invention. 

FIG. 9 illustrates an example general metadata only media missile format 
according to specific embodiments of the invention. 

FIG. 10 illustrates an example metadata only media missile data structure 
30 according to specific embodiments of the invention. 
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FIG. 1 1 illustrates an example media missile data structure according to specific 
embodiments of the invention. 

FIG. 12 illustrates an example binary encoded metadata only media missile data 
according to specific embodiments of the invention. 
5 FIG. 13 illustrates an example presentation of a media missile in a player 

according to specific embodiments of the present invention. 

FIG. 14 illustrates an example general method for guiding a user in creating a 
media missile or presentation data according to specific embodiments of the invention. 

FIG. 15 illustrates an example of a Synchronized Authoring Manager (SAM), also 
10 referred to as a Synchronized Authoring Tool (SAT), according to specific embodiments 
of the invention. 

FIG. 16 illustrates an example of a Preview Manager GUI according to specific 
embodiments of the present invention. 

FIG. 17A-C illustrate examples of choosing various presentation templates 
15 according to specific embodiments of the present invention. 

FIG. 18A-B illustrate adding and previewing media according to a Video Lecture 
Template according to specific embodiments of the present invention. 

FIG. 19A-B illustrate adding and previewing media according to a Headline News 
Template according to specific embodiments of the present invention. 
20 FIG. 20A-B illustrate examples of creating presentation specification metadata 

according to specific embodiments of the present invention. 

FIG. 21 illustrates an example of a Document Generation GUI according to 
specific embodiments of the present invention. 

FIG. 22 is a block diagram showing a representative example logic device in 
25 which aspects of the present invention may be embodied. 

DESCRIPTION OF SPECIFIC EMBODIMENTS I 

In specific embodiments, the present invention involves a streaming media 

structure suitable for providing multimedia streaming over a communication data network 
including a cable network, a local area network, a network of other private networks and 
30 the Internet. The detailed description of the present invention include numerous specific 
details that are set forth in order to provide a thorough understanding of the present 
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invention. However, it will become obvious to those skilled in the art that the present 
•invention may be practiced without these specific details. In other instances, well known 
methods, procedures, components, and circuitry have not been described in detail to avoid 
unnecessarily obscuring aspects of the present invention. 

5 The present invention is presented largely in terms of procedures, steps, logic 

blocks, processing, and other symbolic representations that resemble data processing 
devices. These process descriptions and representations are the means used by those 
experienced or skilled in the art to most effectively convey the substance of their work to 
others skilled in the art. The method along with the system to be described in detail below 

10 is a self-consistent sequence of processes or steps leading to a desired result. These steps 
or processes are those requiring physical manipulations of physical quantities. Usually, 
though not necessarily, these quantities may take the form of electrical signals capable of ( 
being stored, transferred, combined, compared, displayed and otherwise manipulated in a 
computer system or electronic computing devices. It proves convenient at times, 

15 principally for reasons of common usage, to refer to these signals as bits, values, 
elements, symbols, operations, messages, terms, numbers, or the like. It should be borne 
in mind that all of these similar terms are to be associated with the appropriate physical 
quantities and are merely convenient labels applied to these quantities. 

According to one aspect of the present invention, a streaming media structure is 

20 described below. The streaming media structure is provided to create, deliver, 
reassemble, and render synchronized multimedia content with its lightweight machine- 
aware components through streaming services. Each stream represents a user access to a 
video/audio title provided in a host from a terminal device. < 
FIG. 2 illustrates an exemplary computer communication configuration in which 

25 the present invention may be practiced. Central server 102 together with a video/audio 
database 104 is a video/audio source comprising video/audio files that can be accessed on 
demand. As used herein, video/audio files or titles are referred to any video footage, 
video films and/or video/audio clips that can be compressed or packed in any format. One 
of the exemplary formats is Moving Picture Experts Group (MPEG). To play MPEG 

30 video files, one needs need an MPEG viewer or a client software executing in a personal 
computer with sufficient processor speed, internal memory. Another one of the popular 
formats for audio is MPEG-1 Audio Layer-3 (MP3), a standard technology and format for 
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compressing a sound sequence into a very small file (about one-twelfth the size of the 
original file) while preserving the original level of sound quality when it is played 
through a client program (player) downloadable from many web sites. It should be noted, 
however, that the exact formats of the video files do not affect the operations of the 

5 present invention. As will be noted and appreciated, the present invention applies to any 
formats of the video or audio files as well as other multimedia data that may include, but 
not be limited to, binary data and files, hypertext files and scripts. 

Preferably, data network 106 is a data network backbone, namely a larger 
transmission line. At the local level, a backbone is a line or set of lines that local area 

10 networks connect to for a wide area network connection or within a local area network to 
span distances efficiently (for example, between buildings). On the Internet or other wide 
area network, a backbone is a set of paths that local or regional networks connect to for 
long-distance interconnection. Coupled to data network A 106, there are two 
representative proxy servers 108 and 110 that each service representative terminal 

15 devices 1 16-119 via data network 1 12 and 1 14 respectively. It should be noted server 102 
sometimes named "central" or "central video" server does not necessary mean that server 
102 is only served as a repository of all the video titles. In some case, server 102 may 
service terminal devices directly and may retrieve other video/audio files from other 
servers such as proxy servers 108 and 110. 

20 Data network 112 and 114 are typically the Internet, a local area network or 

phone/cable network through which terminal devices can receive video files. The 
terminal devices may include, but not be limited to, multimedia computers (e.g. 116 and 
1 19), networked television sets or other video/audio players (e.g. 117 and 118). Typically 
the terminal devices are equipped with applications or capabilities to execute and display 

25 received video files. For example, one of the popular applications is an MPEG player 
provided in WINDOWS 98 from Microsoft. When an MPEG video file is received in 
streaming from one of the proxy servers, by executing the MPEG player in a multimedia 
computer, the video file can be displayed on a display screen of the computer. 

According to one aspect of the present invention, the streaming media structure is 

30 provided to create, deliver, reassemble, and render synchronized multimedia content with 
its lightweight machine-aware components through streaming services. 
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Cases 1 and 2 below present detailed procedures of how to construct the streaming 
media structure relying on a complied and linked version implementing the present 
invention and executing preferably in a client device. As such a video/audio title can be 
submitted to a host device in a format compliant to the streaming media structure. Cases 

5 3 and 4 below present detailed procedures of how to playback video/audio titles packed in 
a streaming media data structure according to embodiments of the present invention. 

It can be appreciated by those skilled in the art that one of the important features 
in the present invention is a new versatile streaming data file format or structure that can 
be accessed from any types of terminal devices operating under any types of operation 

10 systems. Another important feature is that the present invention, in specific embodiments, 
allows the elimination of preinstalled players. A streaming media structure according to 
specific aspects of the invention can include a built-in player that is downloaded on the 
fly and then operates in a local machine to access the streaming data and to find and 
install any necessary additional executable components. As will be understood to those of 

15 skill in the art from the teachings herein, elimination of preinstalled players for various 
types of streaming content allows streaming content to be more easily accessed on 
devices with limited code storage, such as cell-phones or other hand-held or versatile 
playback devices. 

The processes, sequences or steps and features discussed herein are related to each 
20 other and each are believed independently novel in the art. The disclosed processes and 
sequences may be performed alone or in any combination to provide a novel and 
nonobvious file structure system suitable for media delivery system. It should be 
understood that the processes and sequences in combination yield an equally 
independently novel combination as well, even if combined in their broadest sense. 
25 The streaming media structure as described herein, in accordance with one aspect 

of the present invention is robust, operationally efficient and cost-effective. The 
mechanism provides the best use of servers providing streaming services and permits 
seamless delivery of streaming video with highest quality of services possible. In 
addition, the present invention may be used in connection with presentations of any type, 
30 including sales presentations and product/service promotion, which provides the video 
service providers additional revenue resources. 
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In order to facilitate description, the following discussion will describe the present 
invention in terms of streaming media presentations. It will be understood to those of skill 
in the art, however, that the invention may be used in other streaming data situations. The 
invention should therefore not be taken as limited except as provided in the attached 
5 claims. 

In various embodiments, systems or methods according to various aspects of the 
present invention, can create, deliver, reassemble, and render synchronized streaming 
multimedia content with lightweight machine-aware components through streaming 
services. Each stream represents a user access from a client on one computer to a server 
10 on a different computer. As will be understood from the teachings herein, a system or 
method according to the present invention can handle various types of streaming data 
including real-time multimedia content as part of a presentation or multimedia missile 
through streaming services. 

In particular embodiments, an author (a user or a program) creates synchronized 
15 presentation using a wizard-like Synchronized Authoring Manager (SAM). In further 
embodiments, an SAM can include lightweight software components and/or indications 
of such components with the media data content into a Media Missile container that is 
streamable from a single location. 

The present invention, in some embodiments, thus allows that the streaming of 
20 active and passive content can be done all at once and can be self-contained and 
guidance-free. The active content is the various machine-aware runtime components. The 
passive content is the media data to be rendered at runtime. All of the presentation 
content (both active and passive) can be stored in a container-like file format or object- 
relational database. A special header can be attached to individual media objects to 
25 specify metadata information such as signature, data length (or offset), and linkage. A 
Presentation Specification can be generated in descriptive tagged language such as 
Synchronized Multimedia Integration Language (SMIL) from World-Wide Web 
Consortium (W3C). 

A container-like file format of a Media Missile has advantages for content 
30 providers in the convenience and portability of the content. With the flexible file 
specification of the Media Missile, a content provider of a synchronized Multimedia 
presentation can create the presentation and then store, manage, distribute, or serve the 
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presentation as easily as they would a single file. Thus, a synchronized multimedia 
presentation, using a media missile, can be as easy to deliver or manage as a standard GIF 
or JPEG banner ad. Conventional SMIL-based or HTML-based presentations cannot 
merge multiple objects into a single container and can require managing and downloading 
5 dozens of individual files to render properly. 

A media player implementation, such as a an SMIL player, that is compliant to 
this design can render and playback media contents in real-time in accordance with a 
presentation specification. In particular, individual media content objects, specified in the 
presentation, corresponds to object information in the metadata and media data in the 

10 media missile. 

In a further embodiment, a one-way one-to-one hashing function or other method 
can be used to generate a signature for the media. The signature can be both a watermark ( 
of the media, and also be a search key (to be used in place of a name) if the media is 
stored in a large distributed media repository. In a further important embodiment, a media 

15 missile format enables synchronized presentation on mobile platform with limited pre- 
installed software capability or memory footprint. 

FIG. 1 is a block diagram showing an overall architecture of a system showing 
both authoring and playback according to various embodiments of the invention including 
multiple novel elements. Various aspects and details of the designs are elaborated in later 

20 sections. The overall architecture shown in FIG. 1 includes the following functional 
and/or structural elements. 

Media Missile (MM) 12 is a container specification that allows inclusion of 
various items of machine-aware executable binary (or script), media data, and associated f 
information such as presentation specification or metadata. (MM) can be a container of 

25 container(s) in a nested structure. The machine-aware executable binary (or script) is can 
also be referred to as active media content. The active media content can be implemented 
in machine-neutral programming paradigm (such as Java) or machine-dependent code 
(such as .exe files). Media Missile with active media content is self-contained and 
guidance-free for synchronized real-time multimedia authoring and/or presentation over 

30 network environment. Once the media missile is indicated and the active media content is 
retrieved and loaded to a machine, it is ready to run by itself. 
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Synchronized Authoring Manager (SAM) 10 takes and understands various 
language specifications, formats, and templates in synchronized multimedia, including 
Media Missile. Language specification and templates such as SMIL-compliant ones are 
used as input to SAM's authoring wizard. In specific embodiments, an SAM wizard 
5 guides a user to compose a synchronized presentation and retrieve media data from Media 
Repository Manager 20. In one option, an end result of an SAM is a packaged Media 
Missile. As an option, an SAM 10 can also output a separate presentation specification 
(such as an SMIL presentation specification) that is compliant to international or industry 
standards. 

10 As shown in FIG. 1, SAM 10 can receive as inputs templates, media missile 

specification, multimedia language specification, and media data, which may be received 
from a media repository. Templates, as understood in the art, include data describing a 
general format, layout, or structure for a presentation. SAM 10 creates a media missile 
and/or presentation specification, which is stored in repository 20. 

15 Media Repository Manager (MRM) 20 handles storage and retrieval of real- 

time (streaming) and non-real-time data accessible by a client by activating (or 
"shooting") a media missile. The real-time data (or clocked media data) 22, in a specific 
example, can refer to media data or media missiles stored or broadcast from Streaming21's 
Media Server and Caster or other compatible video server platforms, which are 

20 collectively illustrated as data source 30. The non-real-time data is the presentation 
specification or any other auxiliary information stored on a web server or file server. In 
the figure, Data Store 40 illustrates generally one or more stores for additional digital 
data. 

As shown in FIG. 1, MRM 20 can output either directly to an SMP 50 or through 
25 a web or other front-end interface 55. Also, as shown, MRM 20 acts as a middle manager 
for communicating media data with SAM 10 and managing data from source 30 and data 
store 40. 

(MRM) 20 also can be understood as representing access to any media data that is 
available to SAM 1 0 and SMP 50, such as any authorized and available computer on the 
30 Internet. 

Synchronized Multimedia Player (SMP) 50 has an internal scheduler to process 
delay-sensitive data. It can also handle sequencing of static media data such as images 
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and text. It complies with the SMIL recommended by W3C. Hence, it is able to interact 
with web servers to retrieve and/or to utilize SMIL presentation specifications. In 
addition, an SMP can also work with a Media Missile locally on a single platform or in a 
distributed network environment. 
5 As shown in FIG. 1 , SMP 50 can receive data directly from MRM 30, or through 

interface 55. An SMP 50 can also be directed to receive media data directly from a data 
store 40. 

Media Missile 

FIG. 3 illustrates an example media missile container format according to specific 
10 embodiments of the invention. This example shows a very specific example bit- wise 
arrangement of various elements. It will be understood to those of skill in the art from the 
teachings herein that variations of the field and bit arrangements and formats exist within ^ 
the scope of the invention. Particularly, logic true may be encoded as a variety of 
different bit values and field lengths may vary, included using more than one bit for fields 
15 indicated as bit fields below. The container format shown in FIG. 3 is further described 
below. 

Media Missile 12 is a document container, which can include active content and 
passive media content. The active content is machine-aware executable binary or script. 
Passive content can include a presentation specification, metadata information, and the 

20 media data. The media data can be another Media Missile. Thus, Media Missile has a 
nested structural format. All of the content of a media missile can be stored in a file, 
directory structure, or database, such as on a server system or computer such as repository 
303. In a further embodiment, according to specific implementations, some or all ( 
components can be generated on the fly by a multimedia missile delivery system. 

25 The executable binary (or script) 200 and the presentation specification 210 are 

both optional. If machine-executable code is not present, E-Length 205 is zero. The 
Continuation Bit 202 tells what to expect for the follow-on Media Missile structure. If 
Continuation Bit 202 is one and E-Length 205 is zero, this indicates that the entire Media 
Missile starts from a presentation specification (210). If Continuation Bit 202 is zero and 

30 E-Length 205 is zero, it indicates that the entire Media Missile starts from MetaData 
information 220. Finally, if Continuation Bit 202 is zero and E-Length is "-1", the entire 
Media Missile consists of pure media data. 
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Continuation Bit 212 in front of P-Length 215 tells whether the metadata 
information is included in the media missile (1) or not (0). If no metadata information is 
available, the data right after Presentation Specification is the media data. 

If metadata information 220 is present, the first bit 222 tells whether the data right 

5 after the metadata information is another Media Missile (1) or just pure media data (0). 
In metadata information, the signature field 226 is 128-bits by default to store a digital 
signature using one-way one-to-one hashing function such as MD4 (Message Digest 4) or 
MD5 (Message Digest 5). The length of the signature field can also be implicit by the 
implementation to guard against trivial breakdown and guessing. 

10 Following signature field 226, MetaData 220 holds one or more length field(s) 

(such as 245 and 295) for media data. Each length field is 64-bit starting with a bit 
representing URL (1) or payload (0) (such as 242 and 292). In specific embodiments, 
each length field in MetaData 220 corresponds to a segment of data in media data 230. 
The actual data stored in 230 can be characters indicating a URL, or can be the media 

15 data. Segment(s) of data are stored in sequence according to the sequence of length 
field(s). Here, URL stands for Universal Resource Locator for World-Wide Web 
hyperlink; payload indicates media data for the media player. As will be understood in 
the art, however, URL generally can refer to a locator (such as a filename or database id) 
for any resource available via a computer or communications system. In specific 

20 embodiments, a URL or other locator can indicating streaming or "real-time" data 
sources, such as sources of live streaming data. 

As will be understood from this discussion and other descriptions herein, an 
important and innovative aspect of the Media Missile Container Specification is that once 
a user "shoots" the Media Missile from a particular location, from the perspective of the 

25 user, the contents of the Media Missile can be delivered apparently from a single location. 
In contrast, with formats such as a standard HTML or SMIL type page, a client at a user 
site may have to access dozens of different text, static image, executable, or audio and 
video files, from dozens of different locations, to complete a presentation. With a Media 
Missile, a client can access all of the included data from a single location, with the data 

30 delivered in the Media Missile format. This aspect also has advantages for data providers 
in that Media Missile formatted data is easier to manage, distribute, and deliver than is a 
presentation the requires many different files. 
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Examples of Authoring Media Missile 

What follows are descriptions of the operation of authoring manager 10 according 

to specific embodiments of the present invention. A wizard, as discussed below, is 
understood in the art as logic implemented processes that guides a user through a 
5 sequence of steps in an assisted and constrained fashion to accomplish a result, such as 
installing software or creating a particular document or file type. A template, as discussed 
below, is understood in the art as set of data specifying general format parameters that 
may be used to determine or guide the final format of one or more final files or data 
structures. Synchronization primitives are further described below and illustrated in FIG. 
10 6. 

Example Basic Authoring Method 

The steps to create or update a Media Missile (using Synchronized Authoring 

Manager (SAM) 10 are as follows. 

1. The Synchronized Authoring Manager (SAM) provides template selection for 
15 user to create a synchronized multimedia presentation. It can also read from an existing 

presentation specification for user to update the specification. 

2. SAM takes presentation specification and media data as input used to parse and 
create a tree representation internally. The tree is a top-down representation of the 
synchronization primitives. Each intermediate node is either a parallel or sequential 

20 synchronization primitives. SAM then retrieves or calculates the required fields according 
to the Media Missile specification for the leaf nodes (individual media) in the tree. This 
can be a pipeline process. 

3. SAM attaches system-required fields, e.g., a 64-bit trailer, to individual media ( 
specified in the presentation to become a basic Media Missile with media data only. 

25 4. SAM uses a depth-first tree traversal to calculate and create optional fields 

based on user input. The optional fields are feasible to apply to any node in the tree 
depending on user's needs. 

5. SAM uses a breadth-first tree traversal to traverse the entire tree and create the 
final Media Missile. 
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Sample Sequences of Authoring Interactions 

Case 1: Creating Synchronized Multimedia Presentation for a Guided Lecture: 

1 . A wizard helps author to pick up a lecture template or other template of the 

presentation skeleton through a point-and-click list, where corresponding layout of the 
5 template also displays. 

2. The wizard asks the author to pick up a lecture video and corresponding 
presentation slides in sequence. 

3. The author has a choice to use default duration for slide display duration or use 
MetaData Manager to customize the presentation timeline. 

10 4. If using the MetaData Manager, the wizard displays both the video and 

corresponding slides at a particular point in time. Author has the liberty to modify the 
duration setting for each slide. 

5. The wizard then prepares the final output, both the presentation specification 
and media object, in user- specified formats, e.g., SMIL for presentation specification or 

1 5 Media Missile for media object. 

6. The wizard asks author whether author wishes to keep the final output locally 
or upload it to a specific hosting server. 

Case 2: Updating Synchronized Multimedia Presentation for a Guided Lecture: 

1. Author uses wizard to pick up an existing presentation specification or media 

20 missile from a hosting server. 

2. The wizard asks the author to delete or add video and slides. 

3. The author has a choice to use default duration for slide display duration or use 
MetaData Manager to customize the presentation timeline similar to step 3 in Case 1 . 

4. The wizard repeats step 4 and 5 in Case 1. 

25 5. The wizard asks the author to apply the changes and to check if object deleted 

from the specification should also be permanently removed from the hosting server(s). 
6. The wizard repeats step 6 in Case 1. 

Example Basic Method for Playing Back Media Missile 

Playback of Media Missile according to specific embodiments of the present 

30 invention involves three potentially overlapping processes: delivering, assembling, and 
rendering Media Missiles and their contents. 
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Example Method for delivering a Media Missile are as follows: 

1 . User can use a browser to point-and-shoot the Media Missile. Alternatively, 
user can also use command line interface to launch a local self-contained Media Missile, 
or run a player with a Media Missile name as input. 

5 2. If the Media Missile contains a complete executable binary (or script), it 

generally should be able to load itself and run (e.g. it can be loaded by the native 
operating system of the target device). In specific embodiments, Media Missile takes 
advantage of the runtime loader characteristics available on most operating system 
platforms such as UNIX or Windows. For example, runtime executables or scripts are 

10 identified by a Magic Number (first two bytes in any file) on UNIX platforms. On 
Windows platforms, runtime executables and scripts are explicitly identified by a three- 
letter filename extension registered in the system registry (such as .EXE or .COM). A 
loader generally also checks to determine if the files are indeed executable. Other 
operating system platforms may have different mechanisms to load and launch runtime 

15 executables that may be used by a media missile. 

3. The Media Missile will be streamed and downloaded from an application server 
or a web server if it is at remote site. A browser for a web or application server is 
generally able to run or save a remote file. To make Media Missile compatible with these 
servers, a hyperlink retrieval of a Media Missile in a web-enabled page will implicitly 

20 stream only the machine-aware portion of the Media Missile. The machine-aware portion 
can then be launched and run on a local platform and manage requests for additional 
portions of the media missile. Because the Media Missile's executable footprint size is 
known ahead of time, at runtime the executable is able to manipulate the rest of the Media 
Missile according to its specification. 

25 Example Method for presenting a Media Missile are as follows: 

1 . The Synchronized Multimedia Player (SMP) can be locally-resident executable 

code or executable code that is downloaded or streamed with the Media Missile. SMP has 
a timeline-based scheduler to schedule the playback according to the presentation 
specification. The presentation specification again may be available within the Media 
30 Missile or from another source, such as through web page that is holding the Media 
Missile. 

2. SMP parses the presentation specification to build a presentation tree. 
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3. SMP fires the presentation from the top of the tree based on the synchronization 
characteristics (either parallel or sequential) of each intermediate tree node. For parallel 
nodes, SMP creates independent execution threads to handle each parallel task and each 
of these threads refers to a global time base. For sequential nodes, SMP can use a single 

5 execution thread to step through the sequential tasks. 

4. When SMP fires a leaf node, SMP locates and launches the media player and 
media data. 

If the link to the media data in the specification is a relative path, SMP will by 
default search the Media Missile itself, using the media data's signature or other identifier 
10 to see if the media data is available in the Media Missile. This searching can take place on 
a Media Missile either that exists already at a local machine or on a Media Missile still 
resident at a remote server. Various protocols are known for a streaming data client to 
make requests and receive responses from a streaming server and the protocols can be 
used to search for specific data within the Media Missile. Otherwise, SMP will follow an 
15 absolute path either specified in the Media Missile or from the root of the server where 
the Media Missile or presentation specification was streamed, to locate the media data. 

Regarding the player, SMP will look to determine if the player for the particular 
media data is located on its local machine. If the player is not located on its local 
machine, SMP will search for the player in the Media Missile or in other data store. 

20 Playing a Media Missile 

FIG. 4 illustrates generally the functional elements the enable MM playback 

according to specific embodiments of the invention. As shown in the figure, MM 

playback may access media and code data from a variety of locations. A media missile 

repository 303 can be understood as a primary data storage of media missile data, 

25 including passive data and active code data. The repository in specific implementations 
can be understood to reside on a server, accessible via a local network, wide area 
network, or the Internet. When residing on a server (as opposed to locally stored), 
repository can be understood to have capabilities of streaming session servers, including 
seek to end, search, play and pause capabilities in response to requests from a client. In 

30 some embodiments, the repository will generally store active and passive content needed 
to play a media missile on a variety of platforms, though only content needed for a 
particular platform may be downloaded from 303 by an SMP. 
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Local Code Resource 305 can be understood as those code components available 
"at the local device. For some devices, such as a computer workstation on a LAN, the 
local code resource 305 may be quite extensive. For such devices, a MM scheduler and a 
variety of players may be permanently or nearly permanently present at the local device 

5 and therefore not have to be transferred from a server to play a media missile. For other 
devices, such as a cellular telephone for example, there might be very little or no local 
code resource and all code will need to be downloaded from 303. 

Web or Application Server Repository 307 can be understood as storage for any 
active or passive data not in 303 or 305. As described herein, this data may include data 

10 referenced by a resource locator (such as a URL) within a Media Missile. An example of 
such data would be an HTML page or data used to complete a presentation, advertising 
banners or data from external sources, or any other data referenced in the medial missile ( 
or needed to complete playback. 

As elsewhere described herein, to begin playing a media missile, an executable 

15 binary or script component able to parse a MM is first initiated on a playback device. 
This executable, as described herein, may be initially loaded from 303, 305 or 307. An 
initial task, in particular instances, is to parse a presentation specification 402, if present, 
and then to locate and execute a Real-Time Presentation Scheduler 404. Scheduler 404 
then schedules the downloading (if necessary) and initializing of any players 406 needed 

20 to playback the actual media data. These players can include a variety of off-the-shelf 
media players, such as QuickTime™ or RealMedia™ players or the players can include 
proprietary players. As indicated in the figure, once scheduler 404 has initialized, it 
schedules the downloading and initialization of any players. Thus, the downloading and ( 
initializing of various player and player components is under control of the overall 

25 presentation schedule being run by the scheduler. 

As shown in the figure, scheduler 404 also determines the ending of a presentation 
and terminates the necessary components. 

The steps to render or playback a Media Missile are as follows: 

1 . Software (SW) checks if the Media Missile contains an initial executable code 

30 portion (binary or script) for the platform. If the checking is positive, go to Step 2. If the 
checking is negative, goto Step 3. This checking can be accomplished by software 
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registered at the target device to handle a MM file type or the checking can be self 
starting if the MM itself is an executable file or script. 

2. Since there are machine-aware components in the Media Missile, it can be 
launched on the fly, or it may be used as an enhancement of the SW already present of the 

5 system to render the media data within the Missile. 

3. In case there's no machine-aware executable within the Media Missile, the 
media missile or Operating System will trigger and launch a local SMP with default 
components to read media data from the Media Missile. 

4. SMP handles both barrier and event synchronization with elastic timers for 
10 synchronization point. 

SMP internally has to handle synchronization events according to the presentation 
specification 320. Certain specifications define precise timing sequence. Some others 
have relaxed schedule. Any playback of media(s) can be either sequential or concurrent 
(parallel). 

15 Sample Sequences of Playback Rendering 

Case 3: Playing Back a Synchronized Multimedia Presentation for objects in Media 
Missile format in a Web Page: 

1. Client (SMP) 50 accesses a synchronized multimedia presentation, e.g., 
formatted in a HTML/XML web page. 
20 2. The names/signatures of the media objects at particular points in time in the 

presentation timeline are passed to the player(s) in the web page through a player 
scheduler which is downloaded on the fly or resides on the local computing device. 

3. Each player starts to retrieve a media missile header corresponding to the media 
object name/signature. 

25 4. Each player inspects the header to verify the necessity of downloading and 

registering embedded lightweight software components in the media missile to render the 
media data. 

5. Each player realizes and loads media-specific rendering components at runtime. 

6. For a parallel start of rendering, the scheduler sets a barrier synchronization 
30 primitive, further description of which is provided below. 

7. Each player is enabled to play back and render media data according to the 
"Enable" signal from the player scheduler, the starting time, and the duration in the 
presentation specification. 
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Case 4: Playing Back a Synchronized Multimedia Presentation for objects in Media 
Missile format in a distributed environment: 

1. A self-contained guidance- free Media Missile, which is downloaded or stored 
in local storage, is launched. The beginning of the Media Missile is the player scheduler, 

5 which can be a script or a binary executable object compatible with the runtime platform. 
Hence, it would run by itself. 

2. The player scheduler then loads a presentation specification, which is attached 
after any present runtime executable code segment in the Media Missile. 

3. The names/signatures of the media objects and their position in the Media 
10 Missile at a particular point in time in the presentation timeline are passed to the player(s) 

specified in the Media Missile through the player scheduler. The player(s) (which can 
also be a Media Missile by itself) resides in a distributed network environment. Hence, 
the name of a media object can be a link to a Media Missile over the networks. ( 

4. The procedure then repeats step 3 to 7 in Case 3. 

15 FIG. 5 is a flow chart illustrating an example method for accessing a media 

missile according to a specific embodiment of the invention. 

Playback Timing 

To handle both of the sequential or parallel playback of media contents with either 
precise or elastic timers, an SMP defines the following states: started, initialized, played, 

20 paused, and stopped for each media player under its control. FIG. 6 illustrates the use of 
these states with the synchronization primitives and the timing requirements. SMP 
attached a timer for each synchronization primitive to control the duration. Each media 
player may also have attached timers to control both delay and duration. Whether the 
content has additional embedded timing information or not, it can be played back as if it ( 

25 is a continuous media, i.e., media with timing information. Thus, SMP turns some static 
media types into time-based media such as text or image. The implementation of the 
synchronization checkpoints in the Real-Time Presentation Scheduler at any point in time 
is elaborated as follows and as illustrated in FIG. 6. 

Bl. Sequential Playback with Precise Timer: 

30 In this case, the scheduler 330 pairs the started and the stopped time of any two 

media players 340. As such, a player may be launched, based on estimated latency, to 
cache (or buffer) media data before it starts to render the data. 
B2. Parallel Playback with Precise Timer: 
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In this case, the scheduler checks for barrier synchronization on started, 
initialized, played, and stopped states for all of the players. Each player may be toggled 
between played and paused states if delay timers are set within the player. 

B3, Sequential Playback with Elastic Timer: 
5 In this case, the scheduler pairs the started and the stopped time of any two media 

players. However, one must be stopped before another is started. Alternatively, it can be a 
single media player that takes sequential input scheduled in time. 

B4. Parallel Playback with Elastic Timer: 

In this case, the scheduler only sets barrier synchronization for started and stopped 

10 states for all of the players. Each player controls its own status and the scheduler ignores 
any events from players other than started and stopped events. 

As will be understood to those of skill in the art from the teachings herein, the 
synchronization of multiple players allows the presentation of multi-part media 
presentation. Case B2, for example, might illustrate a case where a presenter or lecturer 

15 is narrating a video clip or a slide-show presentation. The presenter will come to a point 
in the presentation where it is time to begin the video clip, the clip will begin, and the 
presenter audio or audio and video, will continue while the video clip proceeds. The 
video clip might occupy an entire screen space, or the clip might occupy a subwindow in 
the screenspace, while video of the presenter occupies a different window portion. As 

20 shown in FIG. 6, initialization of a Player2 can begin at some point during the activation 
of Playerl. Thus, Player2 or Player3 components need not be downloaded early and 
might not be downloaded at all if a user halts ends access to a presentation. It will be 
understood from the teachings herein, as discussed with respect to FIG. 3, that the format 
of the media missile allows a multi-player presentation to continue from a media missile 

25 location, downloading lightweight software components as needed according to different 
timer strategies as shown in FIG. 4. 

Example Application (WebAds) 

Further understanding of aspects of the present invention will be provided by 

considering the following example. This specific example is provided to illustrate 
30 various aspects of the present invention, but the invention should not be limited by any of 
the details provided herein. This example involves use of the herein described media 
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missile format, according to specific embodiments of the present invention, to enable a 
WebAds streaming presentation. 

Using A Packaging Utility to Package a MM 

As a simple example, a MM can be created from an existing HTML presentation 

5 as described below. In a particular embodiment, a package utility may be launched as 
follows: 

package webAdsUrl fwebAdsDir startingHtmFile] 
In this example, webAdsUrl indicates a URL pointing to an external HTML 
location, in case the remainder of the packaged file is corrupted or empty when the MM is 
10 accessed. webAdsDir indicates an absolute or relative path to specify a directory the 
includes files to be placed in the MM and startingHtmFile indicates an initial html file 
that will be used as the initial outline of an MM. As an example, consider a Web Ads 
directory that includes the following files: 
c:\webAds\s2 1 WebAds.htm 
1 5 c:\webAds\otherLinks.htm 

c:\webAds\logo.gif 
c :\web Ads\sponsor\abcCo j pg 
In this example, the package utility can be run with a command line such as: 

c:\s21\bin\package "http://www.abcCo.com" c:\webAds s21WebAds.htm 
20 This example can be further understood with reference to FIG. 7 to FIG. 13. FIG. 

7 shows a standard browser display of example source files that may be packaged into a 
media missile in an example in accordance with specific embodiments of the present 
invention. According to various specific embodiments of the present invention, and as ( 
will be understood from the teachings provided herein, this source data file could be any 
25 type of media data. Here, the example media data includes an HTML file and its 
included image files, including a GIF and a JPG file. The HTML file also includes a 
reference to another HTML file, otherLinks.htm, which could include any additional links 
or html-type data. 

FIG. 8 shows an example interface for packaging source data into a media missile 
30 according to specific embodiments of the present invention. A variety of different 
packaging tools can be used in different embodiments of the invention, including 
advanced wizards for creating MM as described herein. In this example, however, a 
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simple packaging tool is used to create WebAds, and this packaging tool requests just the 

four parameters illustrated. 

When playing such a file, a player can extract the packaged file and place it into a 

temp directory. If the file is an audio-only clip, a default HTML browser page can be 
5 displayed, or the starting html file (e.g. s21WebAds.htm, as shown in FIG. 13) can be 

displayed in the player window. When the player quits, it generally will remove the 

temporary files and temporary directory. 

If player.dat is missing, a player can use a centrally located file (such as 

www.Streaming21.com/ads) or show error and then stop the application. If player.dat is 
10 corrupted or player.dat contains no files, a player will use the default URL inside the 

player.dat, if possible. In the worst case, the player can display the default bitmap for 

Audio playback. 

FIG. 9 illustrates an example general metadata only media missile format 
according to specific embodiments of the invention. In this example, the file begins with 
15 metadata information. As indicated in the figure, the metadata includes one or more 
URLs for associated files with the streaming presentation. Following the metaData is the 
mediaDataBlock, which contains the actual media object data or locators for media object 
data. 

FIG. 10 illustrates an example metadata only media missile data structure 
20 according to specific embodiments of the invention. Note that in this example, the values 
shown at the top of the figure are assumed. Note that as indicated in the figure, all of the 
source files are incorporated into the media data area of the media missile file, with the 
metadata area indicating the identity of the portions of data in the media data area. 

FIG. 1 1 illustrates an example media missile data structure according to specific 
25 embodiments of the invention. Comparing this figure with FIG. 3 will show one specific 
data structure example of how a Media Missile according to specific embodiments of the 
present invention may be realized. Note that objects in 

FIG. 12 illustrates an example binary encoded metadata only media missile data 
according to specific embodiments of the invention. 
30 Optional processing for WebAd in a media missile allows ignoring an Executable 

Binary of Script portion and a Presentation Specification portion, if either are found in the 
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media missile as, for example, shown in FIG. 11. . An example of seeking in a media 

missile file to skip this data is as follows: 

OpenFile 

Seek to ending- 8bytes 
5 BOOL contination_bit = readlBitO; 

int64 eLength = readNextBits (63 ) ; 

SeekBackToBeginning ; 
if (eLength==-l) 

error // required metaDataBlock is absent 

10 else if (eLength) 

skipFirstBytes (eLength) ; // skip Executable Block 
if (p_blk_exist) 

{BOOL next_blk_is_meta_data = readlBitO; 
if ( »next_blk_is__meta__data) error; // required metaDataBlock is absent 
15 else 

{ int64 p_blk_len = readNextBits ( 63 ) ; 

skipFurtherBytes (p_blk_len) ; // skip Presentation Block 

} // Reach Meta Data 

20 FIG. 13 illustrates an example presentation of a media missile in a player 

according to specific embodiments of the present invention. In a further embodiment, a 

player can use a local ads.html page by default, instead of always needing to contact a 

server for the html page. The local ads.html can include references to other files, such as 

avi, gif, jpeg, html or similar files. In specific embodiments, these may be restricted to 

25 the same directory in which ads.html resides and its sub-directory. Ads.html and related 

files can be packaged into a media missile format, with, as an example, a name in the 

form 'player.dat 1 . 



EXAMPLE SYNCHRONIZED AUTHORING MANAGER IMPLEMENTATION 

An SAM according to specific embodiments of the invention provides a user- 

30 friendly GUI that allows an author to successfully complete a Media Missile (or, in some 
embodiments, just a well-formed Presentation Specification) by applying mouse clicks 
and keystrokes, with no programming background required. SAM provides a wizard- 
driven guide to help non-programming users generate multimedia presentations. SAM 
wizards generate MM based on a series of templates, so that the multimedia presentations 

35 created through SAM are well-formed and can be parsed and rendered by any compliant 
player. In further embodiments, SAM provides a tool to preview different media such as 
text, images, html files, audio and video clips and facilitates the generation of 
synchronization data between different media. SAM can be implemented in Java or other 
programming environments. FIG. 14 illustrates an example general method for guiding a 
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user in creating a media missile or presentation data according to specific embodiments of 
the invention. 

Template Support 

A SAM according to the present invention provides a set of different standard 
5 templates to allow authors to create presentations such as video lectures, headline news, 
and slide shows. 

For example, a video lecture template makes it easy for an author to generate a 
video lecture (video synchronized with HTML slides). 

A headline news template is designed for virtual briefings (video synchronized 
10 with text streams at bottom) in scenes such as: headline news — video with anchorperson 
and text containing headlines; online event — video with event introduction and text 
containing registration information; product advertisement — video with product 
advertisement and text containing pricing information. 

A slide show template is designed for illustrated audio (audio synchronized with 
15 images) in scenes such as: guided tour — narrated map with scenic photos to guide users 
through a site tour; process presentation — audio commentary to slides demonstration of a 
process; or online publication ~ sound book or magazine with reference materials in 
pictures. 

While some prior multimedia authoring tools supported only non-streamed data 
20 types, an SAM according to the present invention, allows inclusion of streaming media 
along with other media types. FIG. 15 illustrates an example of a Synchronized Authoring 
Manager (SAM), also referred to as a Synchronized Authoring Tool (SAT), according to 
specific embodiments of the invention. 

SAM Core Components 

25 In specific embodiments, SAM includes four components: Presentation Wizard, 

Preview Manager, Presentation Specification Data Manager, and Media Missile 
Generator. 

Presentation Wizard is SAM's kernel that provides an easy way to enhance 
disparate media into integrated and dynamic multimedia content. It guides users to create 
30 presentations in just a few steps. 

Preview Manager can display any allowed medium element. If the medium is 
time-irrelevant, such as an image, a text stream, or an html file, then the exact content 
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will be shown in the preview window. If the medium is time-intrinsic, such as an audio or 
video clip, then it will be played continuously from the beginning in the preview window. 
If the medium is from a live streaming source, such as a live camera or video feed, a live 
sample of the medium can be played in the preview window. The Preview Manager can 

5 be activated inside and outside the Presentation Wizard. Users can use the Preview 
Manager to preview the medium before incorporating it into a multimedia presentation. 

Presentation Specification Data Manager is responsible for creating 
Presentation Specification Data (referred to at times in this context as metadata, although 
herein it refers to data for the presentation specification and not the Media Missile 

10 MetaData described above) information and a presentation specification, which is used to 
synchronize different media within a multimedia presentation. Users can playback an 
audio or video clip or stream, meanwhile recording relevant timing information for static ( 
media based on the content of the clips. For example, a user wishing to create a 
presentation that displays a narrated map (using audio), showing some slides of scenic 

15 photos (using images), and access from a live outdoor camera, indicating current weather. 
With Presentation Specification Data Manager, the user can decide the timing point to 
show the next slide according to the audio content or to open a window to display the 
streaming (or live) data. The information is saved in a data structure and used later for 
creating the corresponding presentation content. 

20 Document Generator features automatic creation of a presentation specification 

and/or a Media Missile. The document's structure is based on one of the templates, and 
the media and timing information come from Presentation Wizard operation. The 
Document Generator can package the various media data with the presentation data to ( 
create and any necessary or executable content to create a Media Missile. As an option, 

25 the Document Generator can also output just an SMIL-type presentation specification. 

Creating a Media Missile Using SAM 

Step 1 - Choosing A Template 

At first, users choose the type of presentation they wish to create. Each template is 

designed with a specific purpose in mind. SAM can support templates such as: 

30 Video Lecture Template 

Video lecture usually consists of a video clip or a streaming feed from a streaming 

data source such as a live camera or continuously encoded television broadcast, 

-29- 

3NSDOC1D: <WO 012637BA1_I_> 



WO 01/26378 



PCT7US00/27634 



synchronized with HTML slides. The video can be any media format supported by a 
target Player. The HTML slides are any conforming HTML files or other files that can be 
handled by a player. In some embodiments, if the slides are originally in another format 
(such as Microsoft PowerPoint) they are first converted to HTML format using a "Save as 
5 HTML" command in PowerPoint. 

Headline News Template 

Headline news usually consists of a video clip or a streaming feed synchronized 
with text streams. The video clip can be any media format supported by a target player. 
Text streams are generally formatted ASCII text. In a specific embodiment, users can 
10 directly type any text they want in the second step. 

Slide Show Template 

Slide show usually consists of an audio clip or streaming audio feed synchronized 
with image slides. The audio clip can be any media format supported by a target Player. 
The image slides are image files in GIF, JPEG, or any supported format. 

15 Step 2 - Adding and Previewing Media (Preview Manager) 

The second step is to add media for the presentation. Users can activate the Open 

File dialog box to select the files or streaming sources they wish to add. Users can also 
activate the Preview Manager to preview any selected medium. In various situations, 
users can decide how to create presentation data by using either the Presentation 

20 Specification Data Manager or SAM's default setting. If the Presentation Specification 
Data Manager is chosen, then the wizard will activate the Presentation Specification Data 
Manager as the third step. If SAM's default setting is chosen, then SAM assumes, for 
example, that slides are displayed sequentially, each slide has the same duration and the 
first slide starts displaying at the very beginning of the presentation. Therefore, SAM can 

25 calculate the presentation specification Presentation Specification Data automatically and 
does not need to activate the Presentation Specification Data Manager. 

Step 3 - Creating Presentation Specification Data (Presentation Specification 
Data Manager) 

In this step, the Presentation Specification Data Manger is activated to create the 
30 presentation specification Presentation Specification Data. Presentation specification 
Presentation Specification Data is used to synchronize different media within a 
multimedia presentation. It is as simple as timing information when to change to the next 
slide (image, text, or html file) or when to open a window for live media stream, possibly 
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synchronized with the content of time-intrinsic media (such as video or audio clips). With 
Presentation Specification Data Manager, users can decide at which timing point a slide 
or window starts displaying based on the timeline of another playing audio or video clip. 
An example implementation of a video lecture, in which two slide files (formatted as 
5 HTML files) and a opening of a live streaming source are synchronized to an explanatory 
video clip based on the Presentation Specification Data file shown below: 
sld001.htm>1231646556 

sld002.htm>54643 12313 

livesource.s21>7634537334 

In the Presentation Specification Data file, the string before character is the 

indicator or handle of one slide file or a streaming source, while the string after character 

is the exact media time when the slide or streaming source should start displaying. 

10 Whenever the current media time exceeds one of the media times in the Presentation 

Specification Data file, the corresponding displayed is activated. Since the media times 

are associated with the timeline of one video clip, then the synchronization between the 

video and the slide files is implemented. 

Step 4 - Generating Media Missile Presentation (Document Generator) 

15 Finally, the wizard shows the summary information such as the template name of 

the presentation, file names or URLs of selected media, and the Presentation 
Specification Data for synchronization. After clicking Finish button, the Presentation 
Wizard sends all the information to a Document Generator. The Document Generator 
creates a SMIL presentation specification or MM document based on a selected template 

20 file. In one embodiment, it reads the template file line by line, parsing it, and collecting 
and then placing the corresponding media data, file names, URLs or presentation data 
into a Media Missile structure. 

SAM Component and Screen Implementation Examples 

FIG. 16 illustrates an example of a Preview Manager GUI according to specific 

25 embodiments of the present invention. This Preview Manager is a single window, split 
into a media preview area and a control area at the top of the window. A text field control 
contains the current medium file location. Users can choose a medium in any way know 
for selecting files or indicating locations or streaming sources (URLs) such as typing a 
new path inside the text field; or pressing the button labeled ... to select a location. The 
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selected path is then shown in the text field. After selection, users can press Preview 
button to start displaying the medium. 

According to specific embodiments, each Presentation Wizard step can be 
implemented as a dialog box. Each dialog box has three buttons in common. Users can 

5 click Back button to return to the previous wizard step or click Next button to go forward 
to the next wizard step. Users can click Cancel button to close the wizard at any step 
without generating a document. 

FIG. 17A-C illustrate examples of choosing various presentation templates 
according to specific embodiments of the present invention. 

10 The Adding and Previewing Media functionality can change based on the 

template selected. FIG. 18A shows the Video Lecture template Add and Preview Media 
screen. Labels such as "Add Video Clip", "Add Slide Files", "Video Duration", and 
"Each Slide Duration" are set based on the media support of this template. Users can 
add, remove, or preview media in this dialog box. Users can click Remove button to 

15 delete selected slide file in the list box. Users can also click Preview button to activate the 
Preview Manager for playing back the selected medium file. Figure 7-19 shows a selected 
HTML slides displayed in the Preview Manager. 

FIG. 19A illustrates the situation when Headline News template is selected in step 
one. Headline News consists of video synchronized with text notations, which come from 

20 direct input instead of text file. Therefore, a dialog box FIG. 19B is provided for direct 
text input and is activated when Add Text Notations button is clicked. 

Based on the template selection, GUIs can also present different Presentation 
Specification Data creation methods. FIG. 19 A, for example, shows "Using Presentation 
Specification Data Manager" radio button, which means the wizard will activate the 

25 Presentation Specification Data Manager dialog box as the third step. "Using Default 
Setting" radio button indicates that users just need to type in the duration data. The 
wizard assumes that each slide has the same duration and the first slide starts displaying 
at the very beginning of the presentation. Therefore, the wizard does not need to activate 
the Presentation Specification Data Manager and goes directly to the last step - step 4. 

30 Similarly to Step 2, the functionality of the Presentation Specification Data 

Manager changes based on the template selection in step one. FIG. 20A shows an 
example when users select the Video Lecture template in step one and choose some 
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media files in step two. The "Video Lecture" label is shown in the Presentation 
Specification Data Manager. Labels such as "Video File", "HTML Slide List", "HTML 
Slide Preview", and "HTML Slide File Name" are set based on the media support of 
Video Lecture template. The left part of the Presentation Specification Data Manager is 
5 the video monitor area. The file path is shown in the Video File text field, and the video is 
rendered and played by a Preview module. When the video starts playing, the 
MediaProgressMonitor activated in step two updates the video Current Time every 50 
milliseconds. 

The right part of the dialog box is the slide timeline setting area. Each slide is not 
10 only included in the HTML Slide List box, but also shown in the Slide Timeline 
Recording panel in order. Users can click Reset button to reset each TimeLine text field 
in the panel to 0.0 second and click Sort TimeLine button to sort the slides in timeline 
order. FIG. 20B shows the situation when users select the Headline News template in step 
one. 

15 As shown in Figure FIG. 20A, Slide Timeline Recording panel in Presentation 

Specification Data Manager is the GUI part dynamically generated according to the 
users' selection in step one and two. Each slide file corresponds to one small child panel 
that consists of a File Name text field, a Record button, and a TimeLine text field. Since 
the video timeline is the entire video lecture presentation timeline, users can click Record 

20 button to set the starting time of the corresponding slide based on the video content. 
When one Record button is clicked, the exact time is shown in the corresponding 
TimeLine text field and the following slide is shown in the HTML Slide Preview area. 

As a further option, according to specific embodiments, when Users click the Sort 
button, the slides in the Slide Timeline Recording Panel are sorted in timeline order, 

25 which means the slide with the earliest starting time shows in the first line of the panel 
and so on. Bubble sort is simple and effective; therefore, it can be used as a sort algorithm 
for the timeline. 

Because generating presentation is the last step in the wizard guide, presentation 
information such as template name, selected media files, automatically generated or 
30 manually generated (by Presentation Specification Data Manager) Presentation 
Specification Data is summarized in the Presentation Information text area. Meanwhile 
users can define the presentation title, author name, and copyright. When Finish button is 
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clicked, the Document Generator will get all the information and generate a presentation 
independently. FIG. 21 illustrates an example last step of a presentation generation. 

SAM can utilize live multimedia streams by using Real-time Transport Protocol 
(RTP) for streaming media objects, and Real-time Transport Control Protocol (RTCP) for 
5 controlling and managing remote stream objects like a traditional VCR does (play, stop, 
and pause). RTP is the Internet standard for the transport of real-time data including 
audio and video stream. RTP can be used for media-on-demand application as well as 
interactive services such as Internet telephony. RTCP is the control protocol designed to 
work in conjunction with RTP. 

10 Further Example Source File 

Below is one example of a SMIL-type presentation specification file that could be 

part of a media missile according to specific embodiments of the present invention and 
could be accessed through a synchronized media player according to specific 
embodiments of the present invention. To display this file on a target machine, for 
15 example, would require players such as video, audio, HTML text, and image players, 

each of which could be resident at a target machine or packaged in a media missile. 

<smil> 
<head> 

<meta name =" author" content ="S. Shim" /> 
20 <meta name= "copyright " content="SJSU" /> 

<layout> 

<root-layout background- color= " 192 , 192, 192" 

height="620"/> 
<region id= "videoregion" left= M 40" top="10" 
25 ~ height="10 0" " z-index="0" 

color="192, 192, 192"/ > 
<region id= " lectureregion " lef t = " 1 0 " top= " 22 0 1 

he ight = "200" z - index= " 0 " 

color="192, 192, 192"/ > 
30 <region id="slideregion" left="240" top="10 M 

he ight ="2 00" z-index="0" 

color="192, 192, 192"/> 

</ layout > 
</head> 
35 <body> 
<par> 

<video src="demo\gm.mpg" region= "videoregion" / > 
<seq> 

<html src="demo\carl .gif " begin="5s" dur="3s" 

40 region= " si ideregion" / > 

<html src= " demo\car2 . gif " dur= " 5s " region= " si ideregion" / > 
<html src="demo\car3 .gif" dur="5s" region="slideregion ,, /> 
<html src="demo\car6 .gif " dur=" 5s" region= " si ideregion" / > 
<html src="demo\car5.gif" dur="5s" region="slideregion"/> 

45 <html src="demo\car4.gif " dur="5s" region="slideregion"/> 

<html src= "demo\car5 .gif " dur=" 5s" region="slideregion"/> 
</seq> 



width="1200" 

width- "150" 
background - 

width="4 50" 
background- 

width= "250" 
background- 
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<seq> 

<html src="demo\gml .html " begin="6s" dur="6s" 

region= " lectureregion " / > 

<html sr c=" demo \gm2 . html " dur= M 6s" region= " lectureregion" /> 

5 <html src="demo\gm3 .html " dur= n 6s" region^ " lectureregion" /> 

<html src="demo\gm4 .html " dur=" 6s" region= " lectureregion" /> 

<html src="demo\gm5 . html " dur=" 6s" region= " lectureregion" /> 
</seq> 
</par> 
10 </body> 
</smil > 

Embodiment in a Programmed Information Appliance 

FIG. 22 is a block diagram showing a representative example logic device in 

which aspects of the present invention may be embodied. The invention can be 

15 implemented in hardware and/or software. In some embodiments of the invention, 
different aspects of the invention can be implemented in either client-side logic or a 
server-side logic. As will be understood in the art, the invention or components thereof 
may be embodied in a fixed media program component containing logic instructions 
and/or data that when loaded into an appropriately configured computing device cause 

20 that device to perform according to the invention. As will be understood in the art, a 
fixed media program may be delivered to a user on a fixed media for loading in a users 
computer or a fixed media program can reside on a remote server that a user accesses 
through a communication medium in order to download a program component. 

FIG. 22 shows an information appliance (or digital device) 700 that may be 

25 understood as a logical apparatus that can read instructions from media 717 and/or 
network port 719. Apparatus 700 can thereafter use those instructions to direct server or 
client logic, as understood in the art, to embody aspects of the invention. One type of 
logical apparatus that may embody the invention is a computer system as illustrated in 
700, containing CPU 707, optional input devices 709 and 711, disk drives 715 and 

30 optional monitor 705. Fixed media 717 may be used to program such a system and may 
represent a disk-type optical or magnetic media, magnetic tape, solid state memory, etc.. 
The invention may be embodied in whole or in part as software recorded on this fixed 
media. Communication port 719 may also be used to initially receive instructions that are 
used to program such a system and may represent any type of communication connection. 

35 The invention also may be embodied in whole or in part within the circuitry of an 

application specific integrated circuit (ASIC) or a programmable logic device (PLD). In 
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such a case, the invention may be embodied in a computer understandable descriptor 
language which may be used to create an ASIC or PLD that operates as herein described. 

Other Embodiments 

The invention has now been described with reference to specific embodiments. 

5 Other embodiments will be apparent to those of skill in the art. In particular, a user digital 
information appliance has generally been illustrated as a personal computer. However, the 
digital computing device is meant to be any device for interacting with a remote data 
application, and could include such devices as a digitally enabled television, cell phone, 
personal digital assistant, etc. 

10 While the forgoing and attached are illustrative of various aspects/embodiments of 

the present invention, the disclosure of specific sequence/steps and the inclusion of 
specifics with regard to broader methods and systems are not intended to limit the scope 
of the invention which finds itself in the various permutations of the features disclosed 
and described herein as conveyed to one of skill in the art. 

1 5 Furthermore, while the invention has in some instances been described in terms of 

client/server application environments, this is not intended to limit the invention to only 
those logic environments described as client/server. As used herein, "client" is intended 
to be understood broadly to comprise any logic used to access data from a remote system 
and "server" is intended to be understood broadly to comprise any logic used to provide 

20 data to a remote system. 

It is understood that the examples and embodiments described herein are for 
illustrative purposes only and that various modifications or changes in light thereof will 
be suggested by the teachings herein to persons skilled in the art and are to be included 
within the spirit and purview of this application and scope of the claims. 
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WHAT IS CLAIMED: 

1 . A method of accessing streaming media content comprising: 
initiating a media missile; 

executing a scheduler, said scheduler loading a presentation specification; 
5 under control of said scheduler, initiating one or more players; 

passing indications of media objects in said media missile to one or more players; and 
at said one or more players, under control of said scheduler, presenting said media 
objects from said media missile. 

2. The method of claim 1 further comprising: 

10 at said one or more players, retrieving one or more headers from said media missile 

corresponding to a media object indicated for said one or more players; 
at said one or more players, inspecting said one or more headers to determine the 

necessity of downloading lightweight software components from said media missile 

to render the media data; and 
15 at said one or more players, loading media-specific components at runtime if needed to 

access said media data. 

3. The method of claim 1 wherein said accessing is performed on a mobile platform 
with limited pre-installed software capability. 

4. The method of claim 1 wherein said accessing is performed on a mobile platform 
20 with limited memory footprint. 

5. The method of claim 1 wherein said accessing enables synchronized presentation 
on a mobile platform. 

6. The method of claim 1 further comprising: 

at said one or more players, setting one or more barrier synchronization primitives. 

25 7. The method of claim 1 wherein said one or more players plays media data 
according to an enable signal from said scheduler. 
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8. The method of claim 1 wherein said initiating comprises launching a media 
missile from local storage, said media missile including a initial player scheduler able to 
run on a local device's runtime platform. 

9. The method of claim 1 wherein said initiating comprises launching a media 
5 missile from remote storage, said media missile including a initial player able to run on a 

local device's runtime platform. 

10. The method of claim 1 wherein said initiating comprises accessing a presentation 
formatted in a web page. 

1 1 . The method of claim 1 wherein said scheduler is taken from said media missile. 
10 12. The method of claim 1 wherein said scheduler is preinstalled on an access device. 

13. The method of claim 1 wherein said presentation specification is loaded from said 
media missile. 

14. The method of claim 1 wherein one or more of said indications of media objects 
includes a signature. 

15 15. The method of claim 1 wherein said media missile is a document container 
including active content and passive content, said active content comprising machine- 
aware executable components and said passive content comprising presentation 
specification, metadata information, and media data. 

16. The method of claim 1 wherein said media data can include another media 
20 missile. 

17. The method of claim 15 wherein said active and passive content of said media 
missile can be accessed from a single streaming file location. 

18. A method of accessing streaming media content comprising: 
initiating a media missile; 

25 wherein said media missile is a document container capable of holding machine-aware 

active content and passive content in a variety of formats; and 
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passing indications of media objects in said media missile to one or more players. 

19. The method of claim 18 wherein said initiating comprises: 
using a browser to point-and-shoot said media missile. 

20. The method of claim 1 8 wherein said initiating comprises: 

5 using a command line interface to launch a local self-contained media missile. 

2 1 . The method of claim 1 8 wherein said initiating comprises: 
initiating a player with media missile name as input. 

22. The method of claim 18 wherein said initiating comprises: 

wherein said media missile contains a complete executable component and is able to 
10 load itself and run in accordance with runtime loader characteristics available on a 

particular operating system platform. 

23. The method of claim 18 wherein at least one of said one or more players is 
included in said media missile. 

24. The method of claim 18 wherein at least one of said one or more players is an 
15 external player. 

25. The method of claim 18 wherein said external player is an MPEG player. 

26. The method of claim 18 wherein a hyperlink retrieval of said media missile in a 
web-enabled page implicitly streams only an initial active machine-aware portion of said 
media missile with the remainder of said media missile streamed under control of said 

20 initial active machine-aware portion. 

27. The method of claim 1 8 wherein said accessing may be performed on a platform 
with limited pre-installed software capability. 

28. The method of claim 18 wherein said accessing is performed on a platform with 
limited memory footprint. 



-39- 



ISDOCID: <WO 012637BA1_I_> 



WO 01/26378 



PCT/US00/27634 



29. The method of claim 18 wherein playback of a Media Missile further comprises 
potentially overlapping processes of delivering, assembling, and rendering media missiles 
and their contents. 

30. A method of playing a synchronized multimedia presentation comprising: 
initiating a media missile; 

wherein said media missile is a document container capable of holding machine-aware 

active content and passive content in a variety of formats; 
initializing a synchronized multimedia player with a timeline-based scheduler to 

schedule playback according to a presentation specification; and 
passing indications of media objects in said media missile to one or more players. 

31. The method of claim 30 wherein said presentation specification is available 
through a web page. 

32. The method of claim 30 wherein said presentation specification is available from 
said media missile. 

15 33. The method of claim 30 wherein said synchronized multimedia player is resident 
on a platform prior to said initiating. 

34. The method of claim 30 wherein said synchronized multimedia player is in said 
active content of said media missile. 

35. The method of claim 30 further comprising: 

20 parsing said presentation specification to build a presentation tree; and 

firing said playing from the top of said tree based on the synchronization 
characteristics of intermediate tree nodes. 

36. The method of claim 35 further comprising: 

for one or more parallel nodes, creating independent execution threads to handle each 
25 parallel task, each independent thread referring to a global time base. 

37. The method of claim 35 further comprising: 
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for one or more sequential nodes, using a single execution thread to step through 
sequential tasks. 

38. The method of claim 35 further comprising: 

upon firing a leaf node, locating a correct media player and media data. 

5 39. The method of claim 30 further comprising: 

if a link to media data in said presentation specification is a relative path, searching 

said media missile for said media data; and 
if a link to media data in the specification is not a relative path, following an absolute 

path to locate the media data. 

10 40. The method of claim 39 wherein said absolute path is either specified entirely in 
said link or is initiated from a root of a server from where said media missile was 
streamed. 

41. A method of accessing a multimedia presentation comprising: 

checking a media missile to determine if said media missile contains an executable 
15 portion for a current platform; 

if said checking is positive, launching machine-aware components in said media 
missile on the fly or as an enhancement of local software to render media data 
within said media missile; and 
if said checking is negative, launching a local media player with default components to 
20 read media data from said media missile; wherein said media missile is a document 

container capable of holding machine-aware active content and passive content in a 
variety of formats. 

42. The method of claim 41 further comprising: 

initializing a synchronized multimedia player with a timeline-based scheduler to 
25 schedule playback according to a presentation specification; and 

passing indications of media objects in said media missile to one or more players. 

43. The method of claim 42 wherein said player handles both barrier and event 
synchronization with elastic timers. 
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44. The method of claim 42 wherein said player internally handles synchronization 
events according to a presentation specification. 

45. The method of claim 44 wherein a presentation specification can define precise 
timing sequence or a relaxed schedule. 

5 46. The method of claim 44 wherein a playback can be either sequential or concurrent 
(parallel). 

47. A media missile streaming document container comprising: 
a first location for active media content; and 

a second location for passive media content; 
10 wherein said active media content comprises machine-aware executable code and said 

passive media content can comprise presentation specification, metadata 
information, and media data. 

48. A streamable media missile file comprising: 
a first portion for holding executable content; 

15 a second portion for holding a presentation specification; 

a third portion for holding metadata; and 
a fourth portion for holding one or more media data objects. 

49. The device according to claim 47 wherein one or more of said media data objects 
can include another media missile, providing a nested format. 

20 50. The device according to claim 47 wherein said media missile file can be managed 
at a server and distributed from a server as a single file. 

51. The device according to claim 47 wherein said media missile file is generated on 
the fly at a server though is delivered to a user as a single streaming file. 

52. The device according to claim 47 further comprising: 

25 a first length field indicating if machine-executable code is present and the length of 

said code if present; and 
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a first continuation field indicating presence of other portions in said media missile 
file. 

53. The device according to claim 52 wherein: 

if said first continuation field is logical true and said first length field indicates a zero 
5 length, said media missile begins with said presentation specification; 

if said first continuation field is logical false and said first length field indicates a zero 

length, said media missile begins with said metadata; and 
if said first continuation field is logical false and said first length field indicates a 
negative value, said media missile comprises pure media data. 

10 54. The device according to claim 47 further comprising: 

a second length field indicating if a presentation specification is present and the length ( 

of said presentation specification if present; and 
a second continuation field indicating presence of other portions in said media missile 

file. 

15 55. The device according to claim 54 wherein: 

if said second continuation bit field is logical true, then metadata information follows 

said presentation specification; and 
if said second continuation bit field is logical false, then media data follows said 

presentation specification. 

20 56. The device according to claim 54 further comprising: 

a third length field indicating if metadata is present and the length of said metadata if 1 
present; and 

a third continuation field indicating whether data after said metadata is another media 
missile or media data. 

25 57. The device according to claim 56 wherein: 

said metadata comprises one or more media object length fields indicating length of 
one or more media data objects in said media missile. 

58. The device according to claim 57 wherein: 
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wherein said one or more media objects are stored in sequence according to a sequence 
of said media data length field(s). 

59. The device according to claim 56 wherein one or more media data length fields 
includes a field indicating whether a corresponding object is a resource locator or is a 

5 payload. 

60. A method of assisting a user in creating a synchronized multimedia presentation 
comprising: 

prompting said user to select a template for said presentation; 
prompting said user to select one or more media data objects; 
10 prompting said user to select an option for generating presentation specification data; 

and 

preparing final output, including a presentation specification. 

61. The method according to claim 60 wherein said final output comprises a 
multimedia missile that can include media objects, a presentation specification, metadata, 

15 and active components in a container deliverable as a stream from a single streaming 
source location. 

62. The method according to claim 60 wherein said final output comprises a 
presentation specification in a standard descriptive tagged language. 

63. The method according to claim 60 wherein said method enforces generation of a 
20 well-formed synchronized multimedia presentation. 

64. The method according to claim 60 wherein final output can include both precise 
and elastic synchronization primitives. 

65. The method according to claim 60 further comprising: 

prompting said user to modify duration settings for media data objects; and 
25 prompting said user to use a presentation metadata manager to customize a 

presentation timeline. 

66. The method according to claim 65 further comprising: 
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displaying the video and corresponding slides at a particular point to an user to assist 
creating a presentation. 

67. The method according to claim 61 further comprising: 

prompting said user to select whether to keep the final output locally or upload it to a 
5 specific hosting server. 

68. The method according to claim 60 further comprising: 

accepting presentation specification and media data as input to said authoring 
method. 

69. The method according to claim 60 further comprising: 

10 creating a top-down tree representation of synchronization primitives wherein each 

( 

intermediate node is either a parallel or sequential synchronization primitive; and 
determining required fields according to a media missile specification for terminal leaf 
nodes in said tree. 

70. The method according to claim 69 further comprising: 

15 attaching fields to individual media data specified in a presentation so that said 

individual media have the format of a media missile with media data only. 

71 . The method according to claim 70 further comprising: 

using a depth-first tree traversal to calculate and create optional fields based on user 
input; and 

20 using a breadth-first tree traversal to traverse the entire tree and create the final media 

missile. 

72. A system for streaming synchronized multimedia presentations comprising: 
a real-time data source providing a plurality of streaming media objects; 

a data store for holding and retrieving presentation specifications or media missiles; 
25 and 

a repository manager that transmit a streaming media missile to a target device, where 
the media missile can include packaged indications from said real-time data source, 
can include static media objects, and can include presentation specification, 
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73. The system of claim 72 further comprising: 

an authoring manager for receiving authoring specifications for a streaming 
presentation. 

74. The system of claim 72 wherein said authoring manager accepts indications 
5 provided by an author and using said indications prepares a well-formed streaming 

presentation in accordance with a predefined descriptive tagged language. 

75. The system of claim 72 wherein said repository manager handles transmission of 
executable components as said components are needed and are requested by a 
synchronized multimedia player said for accessing a portion of said streaming 

10 presentation. 

76. The system of claim 75 wherein said repository manager and said player are able 
to begin pre-fetching delay-sensitive media data. 

77. The system of claim 72 wherein data is delivered from said repository manager in 
a container that provides indications for machine-aware executable code, media data, and 

15 associated information. 

78. The system of claim 77 wherein said container can include one or more additional 
containers in a nested structure. 

79. The system of claim 77 wherein said container holds all active and passive content 
necessary to play a synchronized multimedia presentation. 

20 80. The system of claim 72 wherein said authoring manager understands various 
language specifications, formats, and templates in synchronized multimedia. 

81. The system of claim 72 wherein said authoring manager includes a wizard that 
guides a user to compose a synchronized presentation and retrieve media data from a 
media repository manager to produce a media missile. 

25 82. The system of claim 72 further comprising a synchronized multimedia player 
including an internal scheduler to process delay-sensitive data. 
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83. The system of claim 82 wherein said synchronized multimedia player can present 
presentations in accordance with SMIL presentation data. 
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{ // metaDataBlock 

{ bitl data_or_missile 
block is data 
int63 metaDataBlkLen; 
int 128 signature; 

// fileFormatld 
bitl payload_or_url ; 
int63 def aultUrlLen; 



FIG. 8 

Example METADATA block FORMAT 



// 
{ 



{ bitl payload_or_url ; 

in 1 63 f i 1 enameLen ; 

bitl payload__or_url ; 

int 63 f ileDataLen ; 
} f ileNameDataLenStruct [numberOf Files] ; 
mediaDataBlock 

char def aultUrl [def aultUrlLen] ; 
{ 



// in this example always 0; => next 

// sizeof (metaDataBlock) ; 
// 0x03010101 => player version 3.01, 
, version 1 
// always 1; => URL 

// always 0; 

// always 0; 



payload 
payload 



char filename [f ilenameLen] ; 
char f ileContent [f ileDataLen] ; 
} f ileNameDataStruct [numberOf File] ; 
} mediaDataBlock; 
// ending 
bitl p_blk_exist; 
int 6 3 eLength; } 
// be aware the byte order difference in file vs in memory 
// e.g. int value = READ ( ) ; File Hex Data "12 34 56 78"; 
// value in Hex is 0x78563412; 

FIG. 9 
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EXAMPLE METADATA BLOCK CONTENTS 
Assumed values 

Default URL, is "http://www.abcCo.com" strlen 20 

c:\webAds\s21WebAds.htm nameLen 13, size 2562 bytes // the 

main page 

c:\webAds\otherLinks.htm nameLen 14, size 1548 bytes 

c:\webAds\logo.gif nameLen 8, size 5874 bytes 

c:\webAds\sponsor\abcCo.jpg nameLen 17, size 18745 bytes 

Metadata Media Missile values 





Hafa or mi qq-1 *J p 

UuUQ WX ItllOOllC 




o 


-i Tit- £ *3 


IlltS ucLUcl taDlfLJJCil , 


/ / 




intl28 




/ / 


0x03010101 


U1LX 


payJ-faU Ui UX X , 


/ / 


1 


lnco z> 


UclaUlLUi x Lien , 


// 


20 


Dltl 


payioaa. or uri ; 


// 


0 


lUL D d> 


X lXCllalUCiJCill , 


// 


13 


D1C 1 


paylUau UX UX X , 


// 


0 




rixeijau alien x ; 


// 


2562 


DILI 


pay lUdu UX ux x , 


// 


0 


-i r-i C "7 


X X X cllallicJjcll^ , 


// 


14 


Dltl 


payload or url ; 


// 


0 


into j 


E 1 IcUaLdiJCliZ ; 


// 


1548 


Dl L 1 


payiuau ux ux x , 


// 


0 


■i nt- CI 
X 11L. O -J 


X X x ciJaiuciJCii j , 


ii 


8 


Dltl 


payioaa or uri ; 


ii 


0 


X 11 L. D .5 


X X X CUQ LdUCllJ , 


ii 


5874 


t- 1 
JJlll 


pay lUdu ux ux x , 


ii 


0 


mt 63 


f ilenameLen4 ; 


ii 


17 




naul naH or nrl • 

Uay XUQ.U Ui. Hi- -L , 


ii 


0 


int63 


f ileDataLen4 ; 


ii 


18745 


// At 


96 bytes; this is the 


media data 


char 


defaultUrl [20] 


II 


"http : / /www. abcCo . com" 


char 


f ilenamel [13] 


II 


"s21WebAds.htm" (simple 




encryption) 






char 


f ileDatal [2562] 


II 


s21WebAds .htm ■ s data 


char 


f ilename2 [14] 


II 


"otherLinks .htm" (simple 




encryption) 






char 


f ileData2 [2562] 


II 


otherLinks . htm 1 s data 


char 


f ilename3 [8] 


II 


"logo .gif " 


char 


f ileData3 [2562] 


II 


logo.gif 's data (simple 




encryption) 




" sponsor \abcCo. jpg" 


char 


f ilename4 [17] 


II 


char 


f ileData4 [2562] 


II 


sponsor\abcCo . jpg 1 s data 




encryption) 


II 




bitl 


contination_bit 


0 


int63 


eLength // 0 
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MEDIA MISSILE BLOCK CONTENTS EXAMPLE 



char executablecode [nnnnnn] -, 
bitl P_spec_contination_bit ; 
int63 pLength; 

char presentation_spec [pppp] , 
bitl data_or_missile 
int 6 3 metaDataBlkLen ; 
intl28 signature ; 
bitl payload__or_url ; 

de f aul tUrlLen ; 
pay load_or_url ; 
f i 1 enaraeLenl ; 
payload_or__url ; 
f ileDataLenl ; 
payload_or_url ; 
f ilenameLen2 ; 
payload_or_url ; 
f ileDataLen2; 
payload_or_url ; 
f ilenameLen3 ; 
payload_or_url ; 
f ileDataLen3; 
payload_or_url ; 
f ilenameLen4 ; 
payload_or_url ; 
f ileDataLen4 ; 



int63 
bitl 
int63 
bitl 
int63 
bitl 
int 6 3 
bitl 
int63 
bitl 
int63 
bitl 
int63 
bitl 
int63 
bitl 
int63 
// 
// 

//At eLength + pLength plus 104 
that 

// if the data_or_missile bit = 1, 
// media missile. In that case, 

above will 
// have the form: 

// bitl payload_or_url ; // 

// int63 MediaMissileLenl; // 

// and the media data will include the object 



// 
// 

// 
// 
// 

// 
// 
// 
// 
// 
// 
// 
// 
// 
// 
// 
// 
// 
// 
// 
// 
// 
// 



Executable code 
1 

// PPPP 
Presentation Specification 
0 

(64/8) * (4+4*2) = 96 bytes 
// 0x03010101 
1 
20 
0 
13 
0 

2562 
0 
14 
0 

1548 
0 
8 
0 

5874 
0 
17 
0 

18745 



bytes; this is the media data; or note 

this portion could begin another 
the words after the signature block 



0 

rnmwm 



// 
// 
// 



char mediamissile [jnmrn/n] 



char 


defaultUrl [20] 


II 


" ht tp : / /www . abcCo . com M 


char 


f ilenamel [13] 


II 


"s21WebAds .htm" (simple 




encryption) 






char 


fileDatal [2562] 


II 


s2 lWebAds . htm 1 s data 


char 


f ilename2 [14] 


II 


"otherLinks . htm" (simple 




encryption) 






char 


f ileData2 [1548] 


II 


otherLinks . htm 1 s data 


char 


f ilename3 [8] 


II 


"logo.gif " 


char 


f ileData3 [5874] 


II 


logo.gif's data (simple 




encryption) 






char 


f ilename4 [17] 


II 


" sponsor\ abcCo . j pg " 


char 


f ileData4 [18745] 


II 


sponsor\abcCo .jpg's data 




encryption) 






bitl 


continuation_bit 


II 


0 


int63 


eLength // nnnnnn 
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data objects 
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preparing final output data including a 
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B5 



if indicated by user, packaged final output 
data, including media objects and indications 
of media objects, presentation specification, 
metadata, and active components into a 

media missile container streamable from a 



single location 
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FIG. 16 Preview Manager GUI 
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gjjSAT Presentation Wizard — Choose Template 



This Presentation Wizard guide you through the process of creating a . munimedia presentation 



Layout used in template 



Template: 



Video Lecture 



These templates inlcude predefined track types 
and layouts setting. 



Play a video with slides. 

Suggested application: 
Video Presentation. 



Next> I j Cancel 



FIG. 17A Choosing a Video Lecture Template 



EH SAT Presentation Wizard — Choose Template 



This Presentation Wizard guide you through the process of creating a . multimedia presentation 



SMIL Layout used in template 



Template: 



Headline News 



These templates inlcude predefined track types 
and layouts setting. 



Play a video with text streams. 

Suggested application: 
News Broadcasting. 



Next> 



Cancel 



FIG. 17B Choosing a Headline News Template 
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Eg SAT Presentation Wizard — Choose Template 



This Presentation Wizard guide you through the process of creating a multimedia presentation 



Layout used in template 



Template: 



Slide Show 




These templates inlcitde predefined track types 
and layouts setting. 



Play an image with a soundtrack. 

Suggested application- 
Family vacation slideshow, 
product demo, 

corporate training presentation 



Next> 



Cancel 



FIG. 17C Choosing a Slide Show Template 



Eg SAT Presentation Wizard — Add Media 



Video Lecture 

Add Video Clip: 



F:\master\dance.avi 



- r 



Add-. 



Preview j 



Add Slide Files: 



Fr\mastertdocWance\sld001 .htm 
F:\master\doc\dance\sld002.htm 



! Remove 



Preview 



Set lime tor Media: 

' •> Using Metadata Manager 

O Using Default Setting 

video Duration: 
Each Slide Duration: 



seconds 
seconds 



<Back 



Next> 



Cancel 



FIG. 18A Video Lecture Add And Preview Media 
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Preview 



Dance - Slide One 

•romantic ; 
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FIG. 18B Video Lecture Template Preview Selected Slides 



[sH SAT Presentation Wizard — Add Media 
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Preview 
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Remove 



Set Time for Media: 
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Video Duration: 
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Preview 



<Bacft Next> j \ Cancel 



FIG. 19A Headline News Template - Add And Preview Media 
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|l|Add Text Notation for Headline News 



Add Text Notation for Headline News: 



First Jump! 
Beautiful Turn! 
Woo^ 



OK 
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FIG. 19B Add Text Notation For Headline News 



g§SAT Presentation Wizard — Generate Metadata 



Video Lecture 

Video File: iFrVnaster\dance.avi 




Current Time/ Total Time: 11 8.0/1 8.0 



HTML Slide List: 



HTML Slide Preview: 



F:\master\doc\danceU ^ | SlidC 1 WO^ 
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FIG. 20A Video Lecture Template - Create Presentation Specification 

Data 
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JI3SAT Presentation Wizard — Generate Metadata 



Headline News 

Video File: .F:\mastertv1deo.avi 





JJ 
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Current Time/ Total Time: !1 4.6/20.0 



Text Notation List: 



First JumpJOBeautiful . ▼ 



Sort Tone Line 



Text Notation Preview: 
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FIG. 20B Headline News Template - Create Presentation Specification 

Data 



EUSAT Presentation Wizard — Finish 



You have completed using the Presentation Wizard. Your multimedia presentation will now be created. 



Presentation Information: 



Video Lecture: 

Video File: F:\master\dance.avi 
Video Duration: IB.Osec 

Metadata: manually generated 

First HTML Slide File Begin at O.Osec 

HTML Slide File and Duration: 
F:\master\doc\danc e\sld001.htm 15.8sec 
F:\mastertdoc\d ance\sld002.htm 2.1 999998sec 



Presentation Title: 



VideoLecturet 



Author Name: 



Ying Wang 



Copyriglit: 



SJSU 



<Sack | finish 



Cancel 



FIG. 21 Video Lecture Template - Generate Document 
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^ needed machine-aware components through streaming services. 
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Streaming Service Enabled by SMIL Modification 
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Tomoyuki OHYA t , Minoru ETOH t > Regular Members, and Susie WEE'*, Nonmember 



SUMMARY In this paper, we present a CDN (Content 
Delivery Network) archi lecture for mobile streaming service in 
which content segmentation, request routing, pre-fetch schedul- 
ing, and session handofT are controlled by SMIL (Synchronized 
Multimedia Integration Language.) modification. In this architec- 
ture, mobile clients simply follow modified SMIL files downloaded 
from a portal server, these modifications enable multimedia con- 
tent to be delivered to the mobile clients from the best surrogates 
in the CDN. The key components of this architecture are I) con- 
tent segmentation with SMIL modification, 2) on-demand rewrit- 
ing of URLs in SMIL, 3) pre-fetch scheduling based on timing 
information derived from SMIL, and 4) SMIL updates by SOAP 
(Simple Object Access Protocol) messaging for session lumdoffs 
due to client mobility. This architecture enhances streaming me- 
dia quality for mobile clients while utilizing network resources 
efficiently and supporting client mobility in an integrated and 
practical way. The current, status of our prototype on a mobile 
QoS testbed "MOBIQ" is also reported in this paper. 
key words: CDN, mobile network, streaming media, SMIL 

1. Introduction 

With the birth and growth of the Internet and high- 
speed access links, Internet users can enjoy large, 
amounts of web content on the Internet. However, 
network congestion and server overload have become 
major concerns for content delivery. To mitigate these 
effects, CDNs (Content Delivery Networks) are attract- 
ing a great deal of attention. In CDNs, web content is 
distributed to cache servers located at the edge of the 
Internet close to clients, mid the cache servers deliver 
content to clients if they cached the requested content 
beforehand. Since the cached content docs not have to 
be delivered from the origin servers to clients, this leads 
to lower latency for users, better network resource uti- 
lization for network operators, and scalable* service pro- 
visioning for content providers. At the moment, several 
commercial companies provide CDN service on world- 
wide scales [1]. 

In mobile environments, Internet access has be- 
come very popular as represented by i-inode [2] and 
WAP (wireless access protocol) [3]. In addition, IMT- 

Mamiscript received September 9, 2002. 
Manuscript revised January 18, 2003. 
1 The authors are with Multimedia Laboratories, NTT 
DoCoMo, Inc., Yokosuka-shi, 239-8536 Japan. 

n The author is with Hewlett-Packard Laboratories, Palo 
Alto, CA, 94304-1126 USA. 

a) E-mail: yoshi^spg.yrp. nttdocomo.co.jp 



2000 (International Mobile Teleconmmnications-2000 ) 
has begun in Japan [4], and users can now access the 
Internet at up to 384kbit/s tlirough mobile devices. In 
the near future, mobile CDNs will be needed with the 
increasing number of IMT-2000 users. 

At the same time, multimedia streaming service* 
is becoming popular over the Internet. Mobile net- 
works are also expected to provide IP -based multime- 
dia streaming service in addition to web access service. 
From this requirement, 3GPP (3rd Generation Partner- 
ship Project), a standardization body of IMT-2000, has 
denned "packet streaming service (PSS) specifications 
[5)" as a standard for streaming service over 3G mobile 
networks. 

Streaming service, however, presents a lot of chal- 
lenges for network engineers. Unlike TCP applications, 
streaming service requires a certain amount of band- 
width to ensure the bit-rate needed by each media 
stream and the strict delay variation (i.e., jitter) ueode< I 
to avoid buffer underflow at streaming clients. For such 
streaming service, CDNs are a very effective solution 
to relieve network congestion and to keep jitter at tol- 
erable levels. Several papers have already considered 
how to cache streaming content effectively. In [G] [8] . 
streaming content is encoded with layered coding tech- 
niques and a proxy cache manages the number of lay- 
ers to be stored for each stream. [9] proposed a prefix 
caching scheme that stores only the beginning of the 
content to minimize storage while decreasing waiting 
time for playback. [10] also utilizes the idea of prefix 
caching and describes resource management issues on 
cache systems. Selective caching was proposed in [1 lj ns 
a generalization of prefix caching by storing various seg- 
ments of the stream, specifically those that are subject 
to causing buffer underflow at clients. In [12], streaming 
content is divided into variable-sized segments in order 
to improve caching efficiency. Content-aware segmen- 
tation and a cache system using video summarization 
were proposed in [13]. In this system, streaming con- 
tent Is segmented at key frames and users can start 
streaming from any segment while witching the key 
frames. 

Although these segment caching technologies have* 
potential for enhancing network resource utilization 
and caching efficiency, they do not describe how to 
manage the segmented content and how to specify tlu- 
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locations of the proxy caches. Especially in mobile net- 
works, since* network unci radio link conditions are dy- 
namically changed depending on the time and place, it 
is required to specify the best cache locations flexibly. 
In addition, the best cache locations could be changed 
during a streaming session when a mobile client moves 
from one location to another location. The mechanism 
to update the cache locations is necessary for a mo- 
bile streaming media CDN. In [14], a streaming CDN 
architecture that has content naming, content manage- 
ment, and redirection mechanisms is described, but it 
does not consider network dynamics and client mobility 
expected in mobile networks. 

The main goal of our work is to design a mo- 
bile streaming media CDN (MSM-CDN) that enhances 
streaming media quality in environments including mo- 
bile, while utilizing network resources efficiently and 
supporting client mobility in an integrated and practi- 
cal way. To achieve this, we propose a dynamic SMIL 
framework in which all of the CDN technologies includ- 
ing content segmentation, request routing, pre- fetch 
control, and session hand off, are controlled by SMIL 
(Synchronized Multimedia Integration Language) [15] 
modification. The reasons for focusing on SMIL arc 
itemized below. 

• Since SMIL includes content location information, 
client requests can flexibly be redirected to the best 
cache servers by replacing the original content loca- 
tions by the cache locations selected based on the 
network and caching conditions. 

• To improve caching efficiency, large streaming con- 
tent is likely to be divided into several segments. In 
this case, SMIL can easily manage the timing and 
spatial relations among these segments. 

• SMIL makes user access patterns more predictable 
because it provides timing information about when a 
client is likely to request each segment. This enables 
efficient pre- fetch control to cache servers. 

• Since SMIL is based on XML (extensible Markup 
Language) [16], XML-related technologies can be ap- 
plied to this system. 

• 3GPP PSS adopts SMIL as the scene description lan- 
guage. Our dynamic SMIL framework follows the 
3GPP-compliant specifications. This Is important for 
deploying this CDN system over 3G mobile networks. 

• Content providers can easily create SMIL files for de- 
scribing streaming content because SMIL is a stan- 
dard and popular scene description language. 

Figure 1 shows the concept of the dynamic SMIL 
framework. In this system, content providers regis- 
ter their streaming content with SMIL files describing 
the layout and media synchronization about the con- 
tent, and do not need to worry about the existence of 
cache servers. Mobile clients, on the other hand, down- 
load the modified SMIL files from a portal server in 
the CDN. By simply following the downloaded SMIL 




Fig. 1 Concept of dynamic SMIL Framework. 



files, the clients can receive multimedia content from 
the best cache servers. Even when mobile clients move 
to another location, the clients can continue multime- 
dia streaming from the best cache servers by updating 
the SMIL files. The key elements of this system are as 
follows. 

1. The CDN provides content segmentation functional- 
ity to enable segment caching and to improve cache 
efficiency without complicating management of the 
segments. It accepts content registered from con- 
tent providers aud divides the content into several 
segments if it is too large. At the same time, it mod- 
ifies the SMIL file corresponding to the segmented 
content to describe the timing relations among the 
segments. 

2. On-demand URL rewriting described in [17] is ap- 
plied to SMIL. That is, when clients request a SMIL 
file, the original content locations in the SMIL file 
are replaced by the cache server locations* and the 
modified SMIL file is sent to the clients. The clients, 
then, rtiquest session setup to the cache servers ac- 
cording to the modified SMIL file. 

3. To minimize the waste of the network resources and 
cache memory, streaming segments are scheduled to 
be pre-fetched just before clients request the seg- 
ments while the clients are receiving streaming seg- 
ments. The request times are derived from the tim- 
ing information described in SMIL riles. 

4. When mobile clients move from one location to an- 
other location, the clients request an update of the 
SMIL file. The portal server, then, updates the 
SMIL file to indicate the best cache server locations 
and creates the SOAP (Simple Object Access Proto- 
col) [18] message that includes the update informa- 
tion from the previous SMIL file. After the clients 
reproduce the updated SMIL file from the SOAP 
message, they continue the streaming session accord- 
ing to the updated SMIL file. 

The first three items enumerated above are effec- 
tive* in both wired and mobile networks though the last 
one is related only to 'mobile. 1 networks. Our proposed 
framework rovers all of thesr features in an integrated 
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way by SMIL modification. 

The rest of this paper is organized as follows. Sec- 
tion 2 explains onr architecture overview. Then, each 
key element to support our CDN architecture, i.e., con- 
tent segmentation, request routing, pre-fetch schedul- 
ing, arid session liandoff, is described respectively in 
Sect. 3 through Sect. 6. We report the current status 
of our prototyping work in Sect. 7, and finally, Sect. 8 
concludes this paper. 

2. Basic Architecture 

The MSM-CDN architecture we propose is shown in 
Fig. 2. This architecture consists of the wired core net- 
work and the radio access network. The CDN overlays 
the core network, and the CDN control plane is in the 
middle of the CDN. 

Tlie CDN is composed of portal servers, content 
servers, sinTogates, and a content location manager 
(CLM). The portal servers accept content registrations 
from content providers and provide content segmenta- 
tion and SMIL modification functionality. The content 
servers are simple HTTP (Hypertext Transfer Proto- 
col) [19] and RTSP (Real-Time Streaming Protocol) 
[20] servers and store a large mmiher of streaming seg- 
ments derived from the original media content. They 
do not have caching functionality, and work as HTTP 
sei-vers for the surrogates and RTSP servers for mo- 
bile clients. The surrogates are the RTSP servers that 
temporarily store streaming segments and seive mobile 
clients as streaming servers via RTSP session control. 
The CLM Is a server located in the CDN control plane, 
and it takes charge of managing content locations and 
scheduling pre-fetches. 

The mobile clients can connect to the CDN via the 
radio access network. They are the RTSP clients that 
request multimedia content delivery according to the 
SMIL files sent from the portal server. 

From the next section, we describe each key ehs 
ment of our architecture in more detail. 



Coo lent Location Manager 



CDN Control Plane 




SarTogcic 1 



Mobile Streaming Media 
CDN 



• Servers . 



Surrogate 2 



Wired Core Netwoit 



Client 



3. Content Segmentation 

Content segmentation is effective especially for large 
streaming content. By dividing content into several 
segments, surrogates can store and remove the content 
at the granularity of the segments, which leads to ef- 
ficient cache memory and network resource utilization. 
Prefix caching (9), selective caching [11], variable-sized 
segmentation [12], and content-aware segmentation [13] 
have been proposed as the intelligent variants of seg- 
ment caching. 

We also introduce content segmentation in onr 
CDN architecture to improve cache efficiency. The 
problem with content segmentation, however, is to com- 
plicate the management of these segments. In other 
words, content segmentation increases the numbe** »f 
the files to be managed. And to realize such intelli> 
caching schemes proposed in [!)]-[13], the servers in the 
CDN have to know the relations among the segments. 
Thus, how to manage the segments is an important is- 
sue. 

For this purpose, we hind a SMIL file to the seg- 
ments divided from a streaming content. Since SMIL 
provides the timing information for playing or display- 
ing, it can easily manage the timing relations among 
the segments. By parsing the SMIL file, mobile clients 
and other servers can determine which segments arc the 
next to the segments currently served and how long the 
segments will last. 

The procedure of content segmentation is shown 
in Fig. 3. First, content providers register their own 
streaming content with the portal server. Here, the 
content providers also attach the SMIL files descril>- 
ing the layout and media synchronization about their 
streaming content. By registering content, with SMIL 
fi les, content providers can specify how tin? could. . < 
displayed at mobile clients. If the size of the; registered 
content is larger than a certain threshold, the portal 
server divides the content into smaller segments. Al 
the same time, the portal server modifies the SMIL file 
attached to the content to describe the timing relations 
among these segments. Then, it sends the segments to 
a content server and requires it to store them. 

Then, let us see an example of SMIL modification. 
Here, there is a streaming content whose length is sixty 
minutes, and its SMIL file is shown in Fig. 4. The SMIL 
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Fig. 2 Basic architecture of mobile streaming media CDN. 



Fig. 3 Content segmentation am! SMIL morlifiratiuii at portal 
server. 
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shows that content, names are "content- A. mp4" for au- 
dio content and "content- V.mp4" for video content, and 
their durations arc sixty minutes. 

We assume the content is divided into three seg- 
ments, and each segment is ten, twenty and thirty 
minutes long, respectively. In this case, the SMIL 
file is modified at the portal server as shown in 
Fig. 5. According to the modified SMIL file, these seg- 
ments arc renamed as "content- A- {1, 2,3 }.mp4" for au- 
dio segments and "content- V- {1, 2 «3}. in p4" for video 
segments, mid the durations of the segments are ten, 
twenty, and thirty minutes long from the top segments. 
Since the part located between <seq> and </seq> in- 
dicates that these segments are sequentially played and 
displayed from the top segments, the timing relation 
among them is evident. 

In this way, SMIL hies are modified along with con- 
tent segmentation, and the modified SMIL riles indicate 
the names, durations, and relations of the segments. 



<!D0CTYPE smil PUBLIC "-//W3C//DTD SMIL 2.U//EN" 
"hHp^Aww.w3.oiB/2(X)iySMIL20/SMIU0.dttr'> 
<smil xmlns=^ttp://mv\v. w3.org/200 1/SMIL20/Languagc*> 
<hcad> 
<layout> 

<rcgion id="v~ top=**5" lcft="5~ width="180" height-**! R(r7> 
</layout> 
</hcnd> 
<body> 
<pardur= ;f -60min w > 
<audio src-"rtsp://contcnt-scivcr/ton ten t- A.m p4~ /> 
<vidco src=*"rtsp://contcnt-scr\'cr/conlcnt-V.mp4" rcgion="v" t> 
</pai> 
</body> 

</smil> 



Fig. 4 Example of original SMIL filr*. 



<!DOCTYPE smil PUBLIC *-/AV3C//DTD SMIL 2.1V/EN"' 
•1itip:/A™-Av3.orB^(K)l/SMIL20/SMlL20.dld**> 
<smi I xmlns*'*htlp:/Avww.\v3 org/20l> 1 /SMJL2U/Languagc~> 
<licad> 
<layout> 

<rcgion td-"v"* lop=*'5" !cft=~5" \vidlh="l BO** hciglu=" 1 MT7> 
</lnyoul> 
</hcad> 
*:body> 
<seq> 
<par dur= M 10min , *> 
<audio src^'^rtsp-.Z/conlem-scrvcr/contcnt-A-l .mp4~ t> 
<video src=**rtsp://contcnt-scrvcr/contcnt-V-l .mp4*' rcgion=**v" f> 
</par> 

<par dur= w 20niin w > 
| <audio src="*rtsp://conicnt-scrvcr'content-A-2.inp4 M f> 

<vidco src e ***rtsp7/contcnt-scrvcr/contcn(-V-2.nip4~ rcgion="V~ t> 
</par> 

<par dur^Omin'^ 
<audio sn^"rtsp://contcnt-scrv'cr/contcnt-A-3.mp4" /> 
<vidco src=**rtsp://comcnt-scrvt:r/contcnl-V-3.mp4*' region -"V" > 
<Jpnr> 
</seq> 
</body> 

</smil> 



Fig. 5 Modified SMIL lilt: after crmti»nl st*Ri Mentation. 



4. Request Routing 

Request routing is the most important technology for 
CDN. Several request routing methods have already 
been proposed [17]. One popular method is DNS-based 
request routing. However, its resolution is domain level 
and, therefore, fine-grain request routing at object level 
is difficult. Another method is using RTSP REDI- 
RECT. Although it's simple, the REDIRECT message 
has to be issued from original content servers to mo- 
bile clients each time the clients request a streaming 
segment, and this leads to session setup latency. 

In our CDN architecture, we apply on-demand 
URL rewriting [17] to SMIL. Request routing is done by 
modifying SMIL files at the portal server when mobile 
clients request the SMIL files. In other words, the por- 
tal server replaces the content locations in SMIL files by 
the surrogate locations, and returns the modified SMIL 
files to mobile clients. The advantages of this request 
routing are: 

• Request routing is done before clients request a 
streaming session by RTSP. This enables faster pre- 
fetching described in the next section. 

• There is no extra procedure such as DNS-based re- 
quest routing or RTSP REDIRECT method to set up 
a streaming session between clients and surrogates. 
This leads to low latency of session setup. 

Figure 6 shows the request routing procedure by 
SMIL modification. 

When a mobile client requests a SMIL file to start 
multimedia content streaming (1), the portal server 
reads the SMIL file corresponding to the streaming 
content stored in its memory. Then, the portal server 
emeries CLM about the locations of the segments de- 
scribed in the SMIL file (2). At this time, the portal 
server notifies the estimated time when each segment 
will be requested, by parsing the timing information in 
the SMIL file. CLM utilizes this information for pre- 
fetching control described later. The clicut location (IP 
address) is also notified to CLM. 

CLM selects the best surrogates for each segment 



Content Location Manager 




Fig. 6 liuqurst rcmiing pnMiilurc* by SMIL mnfliHcnlinii. 
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<!D0CTYPE ami PUBLIC --//W3C//DTD SMIL 2. (WENT 
-hnpi//xvwwAv3.orsf200 1 /SMlL20/SMlL20.duJ~> 
<sraU xndns=-hnp'7/\NWV.\v3.orB^00l/SM IL20/Langnage'> 
<hcad> 
<Iayotil> 

<region id=V top-"5~ lcfr=-5~ widih=-|80~ height^ 1 80 7> 
</layout> 
</head> 
<bodv> 
<seq> 
<par dur=*M OmuT> 
<audio src=**rtsp-J/surroE*le2/conient-scrvcr/content-A*I jnpj" /> 
<video src=**rlsp^/surrogale2/conteDi-£trvtr/coDteii1-V-1 jnj>4 rcgion^'v" /> 
</par> 

«:par dur=~*20min"> 
<aiidio src«"rtsp J/5urrogftte2/content-server/contcnt-A-2jup4'" t> 
<vidcD src~rt5p^/surrogate2/conUDt-server/ciinteni-V-2 jnp4" regicm^v" /> 

</par> 

<par dur="30mm"> 
<audio src="rtqi://surrogftte3/contcat-server/conlcot-A-3jnp<" /> 
<vtdco src=~rlsp://siirrogaJe3/conlcnt-server/conlcn{-VOjnp4*" region ^"v" t> 
</par> 
</seq> 
</body> 

</smi)> 



Fig. 7 Modified SMIL file to indicate surrogate locations. 

based on the client location, network condition, caching 
status, surrogate load, and so on (3). In this paper, we 
do not specify how to select the surrogate locations. 
After the selection, it responds the locations of the se- 
lected surrogates to the portal server (4). 

The portal server, then, replaces the original con- 
tent locations in the SMIL file by the surrogate loca- 
tions (5). Figure 7 shows an example of the SMIL file 
whose content locations are modified to indicate the 
surrogate locations. The original URLs of the segments 
shown in Fig. 5 are replaced by the surrogate URLs. 
The locations of the segments could be different. In 
the SMIL file shown in Fig. 7, siUTOgatc2 is selected for 
the top four segments, while surrogate3 is selected for 
the last two segments. 

After the modification of the SMIL file, the por- 
tal server returns the modified SMIL file to the mobile 
client ((5). The mobile client receives the SMIL file and 
requests streaming sessions by RTSP according to the 
SMIL hie (7). The surrogate requested to start the 
streaming session delivers RTP (Real-time Transport 
Protocol) (21] media packets that include audio and 
video payloads to the mobile client (8). 

As long as the mobile client follows the SMIL file 
downloaded from the portal server, it always requests 
session setup to the surrogates selected by CLM. 

5. Pre- Fetch Scheduling 

To smoothly start streaming sessions, the surrogates 
have to pre-fetch the segments before the clients re- 
quest the segments. As described in Sect. 3, the modi- 
fied SMIL files provide the timing relations among the 
segments. By utilizing this information, the time when 
the clients will request each segment can be estimated. 
Based on the estimated times, pro-fetching is scheduled 
to he fiuished before the clients request the segments. 
Figure 8 shows the sequence of segment prc- 
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Fig. 8 Sequence or pro-fetching scheduled by content location 
manager. 
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Fig. 9 Pre-fetch scheduling table. 
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fetching. In our CDN architecture, CLM takes charge 
of pre-fetch scheduling. 

When the portal server queries CLM about seg- 
ment locations, it also notifies t he timing information 
for the segments derived from the SMIL file (1). Af- 
ter the surrogate selection is done based on the client 
location, network condition, caching status, and sur- 
rogate load (2), it replies the locations of the selected 
surrogates {3). 

At the same time, CLM creates pre-fetch schedul- 
ing table as shown in Fig. 9. The table is composed of 
session ID, client IP address, start time, pre-fetch sta- 
tus, and pre-fetch information for each segment. Pre- 
fetch information includes estimated request times, seg- 
ment durations, segment names, original locations, and 
surrogates selected to serve the segment. 

Based on this table, CLM starts pre-fetclung. As 
shown in Fig. 8, the first audio and video segments an* 
pre-fetehed (5) while the client is requesting the seg- 
ments (6). Before the client finishes receiving the whole 
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first segments, the second segments are pre-fetchod (7). 
The third segments are prefetched to surrogato.3 while 
the client is receiving the second segments from surro- 
gated (9). 

If the client pauses the streaming session, the sur- 
rogate serving the client notifies CLM of the pause. 
CLM, then, changes the pre-fetch status in the pre- 
fetch scheduling tahle to PAUSE and stops pre-fetch 
requests to surrogates. When the client restarts the 
streaming, the surrogate notifies CLM of the restart. 
CLM recalculates the start time aud the request times 
for each segment, and changes the pre-fetch status to 
ACTIVE. The same procedure is taken when the client 
skips or goes back to the segments. 

6. Streaming Session Hand off 

When a mobile client moves from one location to an- 
other location, the best surrogate serving it could be 
changed. One way to redirect the client's request to 
the new surrogate is sending RTSP REDIRECT from 
the surrogate currently serving the client. In this case, 
however, the client will request the next segments from 
the obsolete surrogate if it is still guided by the SMIL 
file downloaded when starting the multimedia content 
delivery. Another way is client-driven session handoff. 
That is, the client disconnects the current surrogate mid 
requests the new surrogate for a session setup. The 
problem with this is how the client gets to know the 
best surrogate locations for itself. 

To enable session handoff for all of the streaming 
segments, the mechanism to update SMIL files is neces- 
sary when the best surrogates are changed due to client 
mobility. In our CDN architecture, a SOAP message is 
used for this purpose. When a mobile client moves to 
another location and the best surrogates are changed, 
the portal server sends the SOAP message that includes 
the locations of the new surrogates serving the client. 
The client, then, updates its SMIL file and continues to 
receive the streaming segments from the new surrogates 
by following the updated SMIL file. 

The advantages of sending SOAP messages to up- 
date SMIL files- arc: 

• The SOAP messages can explicitly notify the update 
of the current SMIL file. 

• If the SMIL file to be updated is elaborate for describ- 
ing the layout and if its size is very large, sending the 
updated SMIL file itself consumes network resources. 
In that case, the SOAP message that tells only the 
update information of the surrogate locations is more 
cost-effective. 

• Both SMIL and SOAP are based on XML. The same 
modules to parse XML documents can be used at 
mobile clients. In addition, most of the mobile clients 
are expected to be SOAP-capable in the near future. 
This requires only small modification on the clients. 



Content Local bo Manager 




Fig. 10 SMIL update procedure when client moves. 



Figure 10 shows the procedure of SMIL update 
when a mobile client moves. In this figure, we assume 
the best surrogates are changed from surrogate2 to sur- 
rogate 1 . 

When a mobile client moves from one location to 
another location and detects its movement, the mobile- 
client requests the portal server to update the SMIL 
file by the SOAP request message (1). After receiv- 
ing the update request, the portal server requests CLM 
to recalculate the locations of the best, surrogates (2). 
As similar to the case described in Sect. 4, CLM selects 
the best surrogates for each segment based on the client 
location, network condition, caching status, and surro- 
gate load (3). After the selection, CLM responds the 
locations of the selected surrogates to the portal server 
(4). 

The portal server, then, creates the SOAP re- 
sponse message to inform the client of the new sur- 
rogate locations for each segment (5). An example of 
the SOAP response is shown in Fig. 11. The parts lo- 
cated between <oldSrc> arid </oldSrc> indicate the 
locations to be replaced in XPath (XML Path Lan- 
guage) (22] description. The next parts located between 
<newSrc> and </ncwSrc> indicate the URLs of the 
new surrogates selected by CLM. In Fig. 11. the sur- 
rogates serving the second and the third audio/ video 



<SOAP-ENV:Envclopc 

xml nsSOAP-EN V=*TiUp://schcmas. xmlsonp.org/soap/cnvclopcT 
SOAP-ENV:cncodingStylc= 
k *hup://schcmas.xmlsoap.orfi/soap/cncodinB/'V 
<SOAP-ENV:Bodv> 
<m : u pdotc xm lnstm^littp V/docomo.nc j p/dynami cS M I L~> 
<oliiSrc>/smil/bod>/scq/parl2j/oudio/(fj;^rc</oldSrc> 
<ncwSrc>rtsp://surrogalc l/conicrtl-scrvcr/con lent- A-2.mp4</newSrc> 
<oldS rc>/smi l/body/scq/p nrl2Vvidco/tf }src</oldSrc> 
<newSrc>rtsp://su!Togaic I /conicnl-scrvcr/conlcnl-V-2.mp*l </ncwSrc> 
<DldSrc>/smn/bodyyscq/parf3)/audioA«{src</oldSrc> 
<newSrc>rtsp ://surrogatc 1 /conlcnt-scrvcr/contcn l-A-3 ,mp4 </n e wS rc> 
<oldSrc>/smil/body/scq/parl3|/vidcoA«!src</oldSrc> 
<ncwSrc>rtsp7/surrogatcl/conicni-scrvcr/conlcnt-V-3.mp4</ncwSrc> 
</m:updaic> 
</SOAP-ENV:Body> 
</SOAP-ENV:Envclopo 

Fig. 11 Example of SOAP message to update SMIL file. 
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segments are changed to surrogatel. 

The SOAP message created in this way is transmit- 
ted to the client (G). From this SOAP message and the 
current SMIL file, the client creates the new SMIL file 
that includes the new surrogate locations. The clieul 
continues to request the subsequent segments from the 
surrogates described in the updated SMIL file. 

While CLM responds the segment locations to 
the portal server (4), CLM also recalculates pre-fetch 
schedule and rewrites pre-fetch scheduling table. In 
addition, it may request the surrogate currently serv- 
ing the client (smrogate2) to redirect the client's re- 
quest to the newly selected surrogate (surrogatel) (7). 
Then, stirrogate2 issues RTSP REDIRECT method to 
the client (8), and the client requests surrogatel to seud 
the segments currently being transmitted from surro- 
gate2 (9). Note that surrogate2 does not need to is- 
sue RTSP REDIRECT methods for the subsequent seg- 
ments because the client requests surrogatel to deliver 
them directly according to the updated SMIL file. 

Tliis SMIL update mechanism may generate vast 
signaling traffic if a mobile client moves fast and fre- 
quently changes its location. Obviously there is trade- 
off relation between accuracy of best surrogate locations 
and the cost of the signaling traffic. In tins mecha- 
nism, the signaling cast can be flexibly controlled by a 
counter or timeout mechanism to prohibit the SMIL up- 
date for a while. And in an environment where multiple 
types of radio access networks (RANs) are connected, 
the change of the RAN type is a good indication for the 
SMIL update. That is. the best surrogate is expected to 
be changed with high probability when a mobile client 
changes its access network, for example, from 3G RAN 
to a wireless LAN. On the other hand, the best surro- 
gate is expected to be the same as before when a mobile 
client merely changes its access point in wireless LANs. 
Some mechanisms to trigger the SMIL update from net- 
work to mobile clients may also he possible and more 
effective if the network knows the coverage area of each 
surrogate. 



7. System Prototype 

We have built the first version of MSM-CDN over our 
mobile QoS testbed called "MOBIQ [23]" as shown in 
Fig. 12. MOBIQ consists of Diifscrv [24]-based IP core 
network and 3G RAN and IEEE802.il wireless LAN. 
The core network is composed of FreeBSD routers with 
ALTQ [25] module, which provides functionality needed 
by Diffscrv. The 3G RAN includes W-CDMA (Wide- 
band Code Division Multiple Access) RAN emulator 
[26] and header compression nodes [27] to enable effi- 
cient IP packet delivery over 3G wireless links. We have 
also developed streaming servers, live steaming servers, 
and mobile clients. All of these streaming servers and 
clients are compliant with 3GPP-PSS standard. 

The MSM-CDN overlay has been prototyped on 
this testbed. It includes the portal server, the cont( 
location manager (CLM), aud the surrogates. At this 
moment, only the request routing procedure shown in 
Fig. G has been implemented. The CLM always selects 
the closest surrogate to mobile clients by requesting the 
surrogates to measure round trip times (RTTs) between 
mobile clients mid themselves. 

One of the features of this prototype is that all of 
the interfaces between the above CDN nodes are im- 
plemented by SOAP interfaces. Since SOAP is based 
on XML, these interfaces can be easily extended to in- 
clude various security features including confidentiality, 
integrity, and access control. Another advantage of us- 
ing SOAP interfaces is that content providers or other 
enterprises connecting to the Internet can easily utilize 4 
these interfaces and the CDN functions because SOAP 
is widely used in the Internet. 

We have also developed live streaming splitters in 
our CDN prototype. In the live streaming case*, then* 
is no content to he cached on the surrogates. Inst( . 
of the suiTogates, the splitters receive live streams from 
the live streaming servers and copy the streams to I heir 
clients. The same request routing procedure can be 



Live servers 




Portal server 
— SOAP interfaces 



MoWe Chert 

Fig. 12 Prototype of MSM-CDN on mobile QoS testbed. 
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Pig. 13 Time needed for SMIL modification operation and 
RTSP session setup. 



applied to the live streaming. 

On this prototype, we can see better media quality 
with MSM-CDN than without it, when the bottle neck 
link in the core network is congested. However, mo- 
bile clients have to wait more for playback because of 
the SMIL modification operation including SMIL pars- 
ing/modification at the portal server and surrogate se- 
lection by CLM. Then, we measured the time needed 
for the SMIL modification operation as well as the time 
needed for SMIL parsing at clients and RTSP session 
setup in several test cases. 

Figure 13 shows the results of the above measure- 
ment. The test case 1 is a usual streaming case with- 
out CDN. In this case, approximately 700 ms was nec- 
essary for parsing SMIL at clients and RTSP session 
setup. The case 2 is a streaming case from the closest 
surrogate that has already cached the requested con- 
tent. Compared with the case 1, it took a little bit 
longer time because the SMIL modification operation 
was involved. The primary factor of this was the time 
For surrogate selection. However, we think the time is 
negligible because it was less than 200 ms. The case 
3 is that the requested content whose size is 5.4 MB 
had not been cached. Here, the bandwidth of the bot- 
tle neck link in the core network was 10 Mbit/s. In this 
case, the time needed for RTSP session setup was nearly 
seven seconds because the surrogate could not respond 
RTSP DESCRIBE message until it finished download- 
ing the requested content. On the other hand, the test 
case 4 where the content was divided into four segments 
shows that the time for session setup decreased by tol- 
erable level due to shorter download time of the first 
segment. And our mobile clients could receive and play 
back the subsequent segmeuts seamlessly. However, wc 
can see slight increase in the time for surrogate selec- 
tion. This is because the modified SMIL file became 
complicated and had more URLs in it by content seg- 
mentation. Therefore, segmentation of streaming con- 



tent into unnecessarily many segments may cause bad 
effect on the waiting time for playback. 

8. Conclusions 

We presented a mobile streaming media (MSM)-CDN 
architecture in which all of the technologies related to 
CDN are enabled by SMIL modification. In this ar- 
chitecture, mobile clients simply follow the SMIL file 
downloaded from the portal server, and this leads to 
multimedia content delivery horn the best surrogates 
in the CDN. The key components of this architecture 
are as follows. 

1. To improve cache efficiency, content segmentation is 
provided along with SMIL file modification to de- 
scribe the timing relations among the segments. 

2. To redirect client requests to the best surrogates, 
the content, locations in the SMIL file are modified 
to indicate the best surrogate locations. 

3. Segment pre-fetching is scheduled just before clients 
request segments in order to minimize the waste of 
the network resources and cache memory. The re- 
quest times are derived from SMIL files. 

4. SOAP messages to update SMIL files are defined. 
Tliis is necessary when mobile clients move aud the 
best surrogates for them are changed. 

The current status of our prototype was also re- 
ported. We have already built the MSM-CDN over our 
mobile QoS testbed called "MOBIQ." The MSM-CDN 
is completely compliant with 3GPP-PSS standard, and 
all of the CDN nodes have SOAP interfaces to control 
each other. We also showed that the time needed for 
the SMIL modification operation was negligible. 

We are now investigating segmentation and re- 
source management algorithms. These algorithms re- 
quire choosing a media segment size for the content seg- 
mentation operation. An appropriate? ly chosen segment 
size will consider client mobility, since each segment 
boundary provides an opportunity for a session hand- 
off. Segment size also influences content use statistics, 
since knowledge that users only tend to view the first 30 
seconds of a media clip can help the system efficiently 
allocate its storage resources. Resource management 
and caching algorithms can also benefit from knowl- 
edge of content popularity, since content that is likely 
to be accessed by other users should be left in cache 
storage whenever possible. 

Interaction with the underlying QoS network is an- 
other research topic. Notice that the pre- fetch schedul- 
ing table gives a deadline for when the segment must 
be pie- fetched to the surrogate, however it does not 
specify a precise time for the pre-fetch to occur. This 
flexibility allows the CLM and surrogates to schedule 
the actual pre-fetch time while taking the underlying 
QoS base network resource usage into account. 

We are also interested in personalized mobile 
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streaming. Since users liave limited devices and time 
especially in mobile environments, it is important to 
deliver multimedia content dynamically and automati- 
cally summarised depending on user preferences and/or 
network conditions. Towards this, personalization of 
SMIL files [28) and MPEG-7 based personalization [29] 
have already been proposed. We are considering the in- 
tegration of these personalization technologies into our 
MSM-CDN architecture. 
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